Releases: Lightning-Universe/lightning-flash
Compatibility patch
[0.8.2] - 2023-06-30
Changed
- Added GATE backbone for Tabular integrations (#1559)
Fixed
- Fixed a bug where the datamodule could not load files with square brackets in their names (#1501)
- Fixed channel dim selection on segmentation target (#1509)
- Fixed use of `jsonargparse`, avoiding reliance on non-public internal logic (#1620)
- Compatibility with `pytorch-tabular>=1.0` (#1545)
- Compatibility with latest `numpy` (#1595)
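
The square-bracket fix (#1501) relates to a common pitfall: glob-style matching treats `[` and `]` as character-class syntax, so a literal file name such as `scan[1].png` does not match itself when used as a pattern. The sketch below reproduces the pitfall with the standard library only. That Flash hit this exact mechanism is an assumption, and the helper `find_exact` and the file name are hypothetical.

```python
import glob
import os
import tempfile
from typing import List


def find_exact(path: str) -> List[str]:
    """Match ``path`` as a literal name by escaping glob metacharacters."""
    return glob.glob(glob.escape(path))


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file whose name contains square brackets.
    target = os.path.join(tmp, "scan[1].png")
    open(target, "w").close()

    # "[1]" is read as a character class, so the pattern looks for "scan1.png".
    naive = glob.glob(target)
    # Escaping the brackets makes the pattern match the literal file name.
    escaped = find_exact(target)
```

Here `naive` comes back empty while `escaped` contains the file, which is why paths should be escaped before they are handed to any glob-based expansion.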
New Contributors
- @kjappelbaum made their first contribution in #1503
- @yurijmikhalevich made their first contribution in #1501
- @pirj made their first contribution in #1520
- @ArjunSharda made their first contribution in #1517
- @izikgo made their first contribution in #1509
- @manujosephv made their first contribution in #1559
- @mauvilsa made their first contribution in #1620
Full Changelog: 0.8.1.post0...0.8.2
Dependency adjustments
What's Changed
- fixed type of 'n_gram' from bool to int in TranslationTask by @BrightXiaoHan in #1486
- pinned `torchmetrics` version for compatibility by @Borda in #1495
- pinned `sahi` to fix object detection when installing in a fresh environment by @ethanwharris in #1496
- pinned `numpy` for type compatibility by @Borda in #1504
New Contributors
- @BrightXiaoHan made their first contribution in #1486
- @kjappelbaum made their first contribution in #1503
Full Changelog: 0.8.1...0.8.1.post0
Minor compatibility patch
What's Changed
- Add CLIP backbones for text / image classification by @ethanwharris in #1458
- Replace DP/DDP/DDPSpawn plugins to strategies, keep the old for compatibility by @krshrimali in #1451
- Integration of `lightning_utilities` function into `flash` by @uakarsh in #1457
- refactored `image_classifier_head` to `classifier_head` by @Abelarm in #1464
- Raise better error if `icevision` not installed if module isn't found (loading data) by @krshrimali in #1474
- Add support for Lightning 1.8 + Fixes for the CI by @krshrimali in #1470 and #1479
- Fix compatibility with TM 0.10 by @ethanwharris in #1469
New Contributors
Full Changelog: 0.8.0...0.8.1
TPU Support, Remote Data Loading, Video Classification from tensors: Feature Rich Release
We are elated to announce the release of Lightning Flash v0.8, a feature-rich release with improved testing to ensure a better user experience for all our lovely users! The team at Lightning AI and our community contributors have been working hard for this release, and nothing makes us happier than sharing all their lovely contributions with you.
We discuss major features and changes below. For a curated list, scroll to the bottom to see all the pull requests included for this release.
TPU Support 🦸🏻
Before this release, Lightning Flash worked well on a single-core TPU (training, validation, and prediction), but failed comprehensively on multiple cores. This release enables training and validation on multi-core TPUs, allowing users to try out their models on TPUs using Lightning Flash. Prediction on multi-core TPUs is an ongoing effort, and we hope to bring it to you in the near future.
| | Before v0.8 | After v0.8 |
|---|---|---|
| Single core | Training, Validation, Prediction | Training, Validation, Prediction |
| Multiple cores | Not supported | Training, Validation |
As more users try TPUs with Lightning Flash, we expect unseen errors or issues to surface, and we look forward to addressing them. So please don't hesitate to let us know about your experience!
Remote Data Loading: fsspec arrives into Lightning Flash ☁️
Before this release, users had to download a dataset or a file from a URL and pass the local path to our data loader classes. This was a pain point that we are happy to let go of in this release. Starting with v0.8, you no longer have to download those files locally: just pass the file URL and expect it to work!
Before v0.8, download titanic.csv from the URL and pass the path to the train_file argument:

from flash.tabular import TabularClassificationData

datamodule = TabularClassificationData.from_csv(
    categorical_fields=["Age", "Cabin"],
    numerical_fields="Fare",
    target_fields="Survived",
    train_file="titanic.csv",
    val_split=0.1,
    batch_size=8,
)

After v0.8, just pass the URL to the train_file argument:

from flash.tabular import TabularClassificationData

datamodule = TabularClassificationData.from_csv(
    categorical_fields=["Age", "Cabin"],
    numerical_fields="Fare",
    target_fields="Survived",
    train_file="https://pl-flash-data.s3.amazonaws.com/titanic.csv",
    val_split=0.1,
    batch_size=8,
)
For more details, feel free to check out the documentation here.
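
To make the idea of "one argument, local path or URL" concrete, here is a minimal standard-library sketch of such a dispatch. Flash relies on fsspec for this, so the code below is not its implementation. The helper `read_csv_rows`, the `URL_SCHEMES` set, and the sample file are hypothetical, and a `file://` URL stands in for a remote one so the example runs without network access.

```python
import csv
import io
import tempfile
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen

# Schemes routed through urlopen; anything else is treated as a local path.
URL_SCHEMES = {"http", "https", "ftp", "file"}


def read_csv_rows(path_or_url: str) -> list:
    """Read CSV rows from a local path or a URL through one entry point."""
    if urlparse(path_or_url).scheme in URL_SCHEMES:
        with urlopen(path_or_url) as response:
            text = response.read().decode("utf-8")
    else:
        text = Path(path_or_url).read_text(encoding="utf-8")
    return list(csv.DictReader(io.StringIO(text)))


with tempfile.TemporaryDirectory() as tmp:
    local = Path(tmp) / "titanic_sample.csv"
    local.write_text("Age,Fare,Survived\n22,7.25,0\n38,71.28,1\n", encoding="utf-8")

    rows_from_path = read_csv_rows(str(local))      # plain local path
    rows_from_url = read_csv_rows(local.as_uri())   # same file addressed as a URL
```

Both calls return the same rows, which is the behaviour the `train_file` argument offers from v0.8 onwards for remote files.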
Video Classification from Tensors 📹
At times, you need to load raw data or pre-process videos before loading data and training the model. Such raw data for Video Classification is mostly available as tensors, and before this release one had to save the tensors back to video files and pass the paths to the data loading classes in Flash. Starting with this release, we support loading data from tensors for Video Classification.
import torch
from flash.video import VideoClassifier, VideoClassificationData
import flash
# Shape is (channels, frames, height, width): 3 channels, 5 frames, height = 10 and width = 10
mock_tensors = torch.randint(size=(3, 5, 10, 10), low=0, high=255)
datamodule = VideoClassificationData.from_tensors(
train_data=[mock_tensors, mock_tensors], # can also stack: torch.stack((mock_tensors, mock_tensors))
train_targets=["patient", "doctor"],
predict_data=[mock_tensors],
batch_size=1,
)
model = VideoClassifier(num_classes=datamodule.num_classes, pretrained=False, backbone="slow_r50", labels=datamodule.labels)
trainer = flash.Trainer(max_epochs=1)
trainer.finetune(model, datamodule=datamodule)

This will also come in handy for those with multi-modal pipelines who don't want to save the output of one model to files and would rather pass the raw data to the next model, which saves the time otherwise spent on conversion.
Refactored Transforms in Lightning Flash ⚙️
This is one of the community-driven contributions that we are proud to share. Before this release, a user had to pass an input transform class for each stage, which was cumbersome. With this release, you can just pass transform=<YourTransformClass> to the required method. This is a breaking change, and if you are not sure how to resolve it, please create an issue and we'll be happy to help!
Before v0.8:

dm = XYZTask_DataModule.from_xyz(
    train_file=train_file,
    val_file=val_file,
    test_file=test_file,
    predict_file=predict_file,
    train_transform=InputTransform,
    val_transform=InputTransform,
    test_transform=InputTransform,
    predict_transform=InputTransform,
    transform_kwargs=transform_kwargs,
)

After v0.8:

dm = XYZTask_DataModule.from_xyz(
    train_file=train_file,
    val_file=val_file,
    test_file=test_file,
    predict_file=predict_file,
    transform=InputTransform(**transform_kwargs),
)
Note that, within your InputTransform class, you can have <stage>_per_batch_transform_on_device methods to support various stages.
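from typing import Callable

from flash.core.data.io.input_transform import InputTransform


class SampleInputTransform(InputTransform):
    def per_sample_transform(self) -> Callable:
        # Applied to every sample in every stage; here it is the identity.
        def fn(x):
            return x

        return fn

    # Stage-specific hooks: each returns the callable used for that stage only.
    def train_per_batch_transform_on_device(self) -> Callable:
        return ...

    def val_per_batch_transform_on_device(self) -> Callable:
        return ...

    def test_per_batch_transform_on_device(self) -> Callable:
        return ...

    def predict_per_batch_transform_on_device(self) -> Callable:
        return ...

Object Detection in Flash is now servable 💁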
If you aren't aware yet, Lightning Flash supports serving models. Starting with this release, Object Detection joins the tasks that can be served using Lightning Flash. Below is an example of what the inference server code for object detection looks like:
# Inference Server
from flash.image import ObjectDetector
model = ObjectDetector.load_from_checkpoint("https://flash-weights.s3.amazonaws.com/0.8.0/object_detection_model.pt")
model.serve()

For more details, check out the documentation here.
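
On the client side, a request to the running server could be prepared along the following lines. This is a sketch under stated assumptions: the body layout (`session`, `payload`, `inputs`, `data` with a base64-encoded image) and the `http://127.0.0.1:8000/predict` address mirror the pattern used in Flash serving examples for other image tasks, and should be checked against the documentation for object detection. The helper `build_predict_body` and the placeholder bytes are hypothetical.

```python
import base64
import json


def build_predict_body(image_bytes: bytes, session: str = "UUID") -> dict:
    """Wrap raw image bytes in the JSON body assumed by the serve endpoint."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return {"session": session, "payload": {"inputs": {"data": encoded}}}


# Placeholder bytes; in practice, read them from an image file opened in "rb" mode.
body = build_predict_body(b"placeholder image bytes")
request_json = json.dumps(body)

# Sending the request needs a running server and the third-party `requests` package:
# import requests
# response = requests.post("http://127.0.0.1:8000/predict", json=body)
# print(response.json())
```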
Added
- Added support for `from_tensors` for `VideoClassification` (#1389)
- Added fine tuning strategies for DeepSpeed (with parameter loading and storing omitted) (#1377)
- Added `torchvision` as a requirement to `datatype_audio.txt` as it's used for Audio Classification (#1425)
- Added `figsize` and `limit_nb_samples` for showing batch images (#1381)
- Added support for `from_lists` for Tabular Classification and Regression (#1337)
- Added support for `from_dicts` for Tabular Classification and Regression (#1331)
- Added support for using the `ImageEmbedder` SSL training for all image classifier backbones (#1264)
- Added support for audio file formats to `AudioClassificationData` (#1085)
- Added support for Flash serve to the `ObjectDetector` (#1370)
- Added support for loading `ImageClassificationData` from PIL images with `from_images` (#1372)
- Added support for loading `ObjectDetectionData` with `from_numpy`, `from_images`, and `from_tensors` (#1372)
- Added support for remote data loading with fsspec (#1387)
- Added support for TSV files to `from_csv` methods (#1387)
- Added support for more formats when loading audio files (#1387)
- Added support to use any task as an embedder by calling `as_embedder` (#1396)
- Added support for normalization of images in `SemanticSegmentationData` (#1399)
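
TSV support in the `from_csv` methods (#1387) comes down to reading the same tabular layout with a tab as the delimiter. A minimal standard-library sketch, assuming the delimiter is chosen from the file extension (the helper `read_table` and that selection rule are hypothetical, not Flash's implementation):

```python
import csv
import io


def read_table(text: str, filename: str) -> list:
    """Parse delimited text, using tabs for .tsv files and commas otherwise."""
    delimiter = "\t" if filename.lower().endswith(".tsv") else ","
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


csv_rows = read_table("Age,Survived\n22,0\n", "train.csv")
tsv_rows = read_table("Age\tSurvived\n22\t0\n", "train.tsv")
```

Both calls yield the same list of row dictionaries, so downstream code does not need to know which format the file used.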
Changed
- Changed the `ImageEmbedder` dependency on VISSL to optional (#1276)
- Changed the transforms in `SemanticSegmentationData` to use albumentations instead of Kornia (#1313)
Removed
- Removed support for audio files with `sd2` extension, because SoundFile (for sd2 extension) doesn't accept fsspec objects. ([#1409](https://github.com/Lightning-AI/lightn...
Bi-Weekly Patch Release
[0.7.5] - 2022-05-11
Fixed
- Fixed image classification data show_train_batch for subplots with rows > 1. (#1315)
- Fixed support for all the versions (including the latest and older) of baal. (#1315)
- Fixed a bug where a loaded TabularClassifier or TabularRegressor checkpoint could not be served (#1324)
- Fixed a bug where the freeze_unfreeze and unfreeze_milestones finetuning strategies could not be used in tandem with a onecyclelr LR scheduler (#1329)
- Fixed a bug where the backbone learning rate would be divided by 10 when unfrozen if using the freeze_unfreeze or unfreeze_milestones strategies (#1329)
Contributors
@Borda @ethanwharris @kaushikb11 @krshrimali
If we forgot someone let us know 😃
Bi-Weekly Patch Release
[0.7.4] - 2022-04-27
Fixed
- Fixed a bug where LR schedulers from HuggingFace could not be used with newer versions of PyTorch Lightning (#1307)
- Fixed a bug where the default Flash zero configurations for `ObjectDetector`, `InstanceSegmentation`, and `KeypointDetector` would error with the latest version of some requirements (#1306)
- Fixed plain `LightningModule` support for Flash data modules. (#1281)
Contributors
@Borda @ethanwharris @krshrimali @rohitgr7
If we forgot someone let us know 😃
Bi-Weekly Patch Release
[0.7.3] - 2022-04-13
Fixed
- Fixed a bug where some backbones were incorrectly listed as available for the `ObjectDetector`, `InstanceSegmentation`, and `KeypointDetector` (#1267)
- Fixed a bug where the backbone would not be frozen when finetuning the `SpeechRecognition` task (#1275)
- Fixed a bug where the backbone would not be frozen when finetuning the `QuestionAnswering` task with certain model types (#1275)
Contributors
If we forgot someone let us know 😃
Bi-Weekly Patch Release
[0.7.2] - 2022-03-30
Fixed
- Fixed examples (question answering), where NLTK's `punkt` module needs to be downloaded first. (#1215)
- Fixed normalizing inputs to video classification (#1213)
- Fixed a bug where `pretraining_transforms` in the `ImageEmbedder` was never called. (#1196)
- Fixed a bug where `BASE_MODEL_NAME` was not in the dict for dino and moco strategies. (#1196)
- Fixed support for `torch==1.11.0` (#1234)
- Fixed DDP spawn support for `ObjectDetector`, `InstanceSegmentation`, and `KeypointDetector` (#1222)
- Fixed a bug where `InstanceSegmentation` would fail if samples had an inconsistent number of bboxes, labels, and masks (these will now be treated as negative samples) (#1222)
- Fixed a bug where collate functions were never called in the `ImageEmbedder` class. (#1217)
- Fixed a bug where `ObjectDetector`, `InstanceSegmentation`, and `KeypointDetector` would log train and validation metrics with the same name (#1252)
- Fixed a bug where using `ReduceLROnPlateau` would raise an error (#1251)
- Fixed GPU support for self-supervised training with the `ImageEmbedder` (#1256)
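
The `InstanceSegmentation` fix above (#1222) treats a sample whose bboxes, labels, and masks disagree in count as a negative sample, that is, one without annotated objects. A rough sketch of such a check, where the dictionary keys and the helper `as_negative_if_inconsistent` are hypothetical and do not reflect Flash's internal data format:

```python
def as_negative_if_inconsistent(sample: dict) -> dict:
    """Return the sample unchanged, or stripped of annotations if counts differ."""
    counts = {len(sample["bboxes"]), len(sample["labels"]), len(sample["masks"])}
    if len(counts) == 1:
        return sample
    # Inconsistent annotations: keep the sample but drop every object.
    return {**sample, "bboxes": [], "labels": [], "masks": []}


consistent = {"bboxes": [[0, 0, 4, 4]], "labels": [1], "masks": ["mask_0"]}
broken = {"bboxes": [[0, 0, 4, 4]], "labels": [1, 2], "masks": []}

kept = as_negative_if_inconsistent(consistent)
negative = as_negative_if_inconsistent(broken)
```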
Contributors
@aisensiy @andife @aniketmaurya @Borda @dudeperf3ct @ethanwharris @krshrimali
If we forgot someone let us know 😃
Bi-Weekly Patch Release
[0.7.1] - 2022-03-01
Added
- Added the normalization parameters of `torchvision.transforms.Normalize` as `transform_kwargs` in the `ImageClassificationInputTransform` (#1178)
- Added `available_outputs` method to the `Task` (#1206)
Fixed
- Fixed a bug where DDP would not work with Flash tasks (#1182)
- Fixed DDP support for `VideoClassifier` (#1189)
- Fixed a bug where buffers in loss functions were not correctly registered in the `Task` (#1203)
- Fixed support for passing a sampler instance to `from_*` methods / the `DataModule` (#1204)
Contributors
@aisensiy @AndresAlgaba @Borda @ethanwharris
If we forgot someone due to not matching commit email with GitHub account, let us know :]
PyTorch Tabular, Enhanced Data Loading and Stability
[0.7.0] - 2022-02-15
Added
- Added support for multi-label, space delimited, targets (#1076)
- Added support for tabular classification / regression backbones from PyTorch Tabular (#1098)
- Added Flash zero support for tabular regression (#1098)
- Added support for COCO annotations with non-default keypoint labels to `KeypointDetectionData.from_coco` (#1102)
- Added support for `from_csv` and `from_data_frame` to `VideoClassificationData` (#1117)
- Added support for `SemanticSegmentationData.from_folders` where mask files have different extensions to the image files (#1130)
- Added `FlashRegistry` of Available Heads for `flash.image.ImageClassifier` (#1152)
- Added support for `ObjectDetectionData.from_files` (#1154)
- Added support for passing the `Output` object (or a string e.g. `"labels"`) to the `flash.Trainer.predict` method (#1157)
- Added support for passing the `TargetFormatter` object to `from_*` methods for classification to override target handling (#1171)
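
As an illustration of the multi-label, space-delimited targets added in #1076, a target such as `"cat dog"` names two labels for one sample. The sketch below turns such strings into multi-hot vectors with plain Python. The helper `multi_hot`, the alphabetical label ordering, and the example labels are assumptions for illustration, not a description of how Flash encodes targets internally.

```python
def multi_hot(targets: list) -> tuple:
    """Encode space-delimited label strings as multi-hot rows."""
    # Collect the label vocabulary and fix an (assumed) alphabetical order.
    labels = sorted({label for target in targets for label in target.split()})
    index = {label: position for position, label in enumerate(labels)}

    encoded = []
    for target in targets:
        row = [0] * len(labels)
        for label in target.split():
            row[index[label]] = 1
        encoded.append(row)
    return labels, encoded


labels, encoded = multi_hot(["cat dog", "dog", "bird cat"])
# labels  -> ["bird", "cat", "dog"]
# encoded -> [[0, 1, 1], [0, 0, 1], [1, 1, 0]]
```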
Changed
- Changed `Wav2Vec2Processor` to `AutoProcessor` and separate it from backbone [optional] (#1075)
- Renamed `ClassificationInput` to `ClassificationInputMixin` (#1116)
- Changed the default `learning_rate` for all tasks to be `None`, corresponding to the default for your chosen optimizer (#1172)
Fixed
- Fixed a bug when not explicitly passing `embedding_sizes` to the `TabularClassifier` and `TabularRegressor` tasks (#1067)
- Fixed a bug where under some circumstances transforms would not get called (#1072)
- Fixed a bug where prediction would sometimes give the wrong number of outputs (#1077)
- Fixed a bug where passing the `val_split` to the `DataModule` would not have the desired effect (#1079)
- Fixed a bug where passing `predict_data_frame` to `ImageClassificationData.from_data_frame` raised an error (#1088)
- Fixed a bug where segmentation files / masks were loaded with an inconsistent ordering (#1094)
- Fixed a bug with `AudioClassificationData.from_numpy` (#1096)
- Fixed a bug when using `SpeechRecognitionData.from_files` for training / validating / testing (#1097)
- Fixed a bug when using `SpeechRecognitionData.from_csv` or `from_json` when predicting without targets (#1097)
- Fixed a bug where `SpeechRecognitionData.from_datasets` did not work as expected (#1097)
- Fixed a bug where loading data for prediction with `SemanticSegmentationData.from_folders` raised an error (#1101)
- Fixed a bug when passing a `predict_folder` argument to `from_coco` / `from_voc` / `from_via` in IceVision tasks (#1102)
- Fixed `ObjectDetectionData.from_voc` and `ObjectDetectionData.from_via` (#1102)
- Fixed a bug where `InstanceSegmentationData.from_coco` would raise an error if not using file-based masks (#1102)
- Fixed `InstanceSegmentationData.from_voc` (#1102)
- Fixed a bug when loading tabular data for prediction without a target field / column (#1114)
- Fixed a bug when loading prediction data for graph classification without targets (#1121)
- Fixed a bug where loading Seq2Seq data for prediction would not work if the target field was not present (#1128)
- Fixed a bug where `from_fiftyone` classmethods did not work correctly with a `predict_dataset` (#1136)
- Fixed a bug where the `labels` property would return `None` when using `ObjectDetectionData.from_fiftyone` (#1136)
- Fixed a bug where `TabularData` would not work correctly with no categorical variables (#1144)
- Fixed a bug where loading `TabularForecastingData` for prediction would only yield a single sample per series (#1149)
- Fixed a bug where backbones for the `ObjectDetector`, `KeypointDetector`, and `InstanceSegmentation` tasks were not always frozen correctly when finetuning (#1163)
- Fixed a bug where `DataModule.multi_label` would sometimes be `None` when it had been inferred to be `False` (#1165)
Removed
- Removed the `Seq2SeqData` base class (use `TranslationData` or `SummarizationData` directly) (#1128)
- Removed the ability to attach the `Output` object directly to the model (#1157)
Contributors
@Actis92 @ajinkyaindulkar @bartonp2 @Borda @daMichaelB @ethanwharris @flozi00 @karthikrangasai @MikeTrizna
If we forgot someone due to not matching commit email with GitHub account, let us know :]