TransformedDataset
minnt.TransformedDataset
Bases: Dataset
A dataset capable of applying transformations to its items and batches.
The transformation of individual items is specified by overriding the transform property.
The transformation of batches can be performed by overriding collate
(processing a list of items into a batch) and/or transform_batch
(modifying the batch after collation). When any of these is overridden, the collate_fn
of a torch.utils.data.DataLoader must be set to self.collate_fn, which is automatically done when using
the dataloader method of this class.
Source code in minnt/transformed_dataset.py
__init__
Create a new transformed dataset using the provided dataset with an optional limit.
Parameters:
- dataset (Dataset) – The source dataset implementing __len__ and __getitem__.
- dataset_limit (int | None, default: None) – If given, limits the length of the dataset to this value.
Environment variables: The following environment variable can be used to override the method parameters:
- MINNT_DATASET_LIMIT: If set to a positive integer, overrides the dataset_limit parameter.
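The interaction between dataset_limit and MINNT_DATASET_LIMIT can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's code: the helper name effective_length is hypothetical, values that are not positive integers are assumed to be ignored, and the limit is assumed to only ever shorten the dataset.

```python
import os
from typing import Optional


def effective_length(dataset_len: int, dataset_limit: Optional[int] = None) -> int:
    """Hypothetical helper: the dataset length after applying the limit."""
    try:
        env_limit = int(os.environ.get("MINNT_DATASET_LIMIT", ""))
    except ValueError:
        env_limit = 0  # unset or not an integer
    # A positive integer in the environment overrides the parameter.
    # Assumption: any other value leaves the parameter untouched.
    if env_limit > 0:
        dataset_limit = env_limit
    if dataset_limit is None:
        return dataset_len
    # Assumption: the limit never extends the dataset beyond its real size.
    return min(dataset_len, dataset_limit)


os.environ["MINNT_DATASET_LIMIT"] = "3"
print(effective_length(100, dataset_limit=10))  # 3
del os.environ["MINNT_DATASET_LIMIT"]
print(effective_length(100, dataset_limit=10))  # 10
```

Such an override is convenient for quick smoke tests, since a run can be shortened without editing the code that constructs the dataset.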
__len__
__len__() -> int
Return the number of items in the dataset.
__getitem__
Return the item at the specified index.
transform
class-attribute
instance-attribute
transform: Callable | None = None
If given, transform is called on each item before returning it.
If the dataset item is a tuple, transform is called with the tuple unpacked.
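The tuple-unpacking rule can be illustrated with a small stand-alone sketch. The helper apply_transform and the example transform scale_and_shift are hypothetical names used only for illustration; the sketch mirrors the documented behavior rather than reproducing the library's implementation.

```python
def apply_transform(item, transform=None):
    """Hypothetical helper mirroring the documented per-item rule."""
    if transform is None:
        return item
    if isinstance(item, tuple):
        # Tuple items are unpacked, so the transform receives one argument per element.
        return transform(*item)
    return transform(item)


def scale_and_shift(image, label):
    # Receives the two tuple elements as separate arguments.
    return [pixel / 255 for pixel in image], label + 1


print(apply_transform(([0, 255], 3), scale_and_shift))  # ([0.0, 1.0], 4)
print(apply_transform([1, 2, 3], len))  # 3
```

In practice this means a transform for an (input, target) dataset is written with two parameters, not one parameter holding the pair.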
collate
class-attribute
instance-attribute
collate: Callable | None = None
If given, collate is called on a list of items before returning them as a batch.
transform_batch
class-attribute
instance-attribute
transform_batch: Callable | None = None
If given, transform_batch is called on a batch before returning it.
collate_fn
A function for a DataLoader to collate a batch of items using collate and/or transform_batch.
This function is used as the collate_fn parameter of a DataLoader when collate or transform_batch is set.
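The order of operations described above (collate first, then transform_batch) can be sketched as follows. The factory make_collate_fn is a hypothetical stand-in, not the method itself. Two assumptions are made: when collate is not set, some default collation runs instead (here simply list, where a real DataLoader would presumably use PyTorch's default collation), and transform_batch receives the batch as a single argument (how tuple batches are passed is not specified in this section).

```python
def make_collate_fn(collate=None, transform_batch=None, default_collate=list):
    """Hypothetical factory composing collate and transform_batch in order."""

    def collate_fn(items):
        # Step 1: turn the list of items into a batch.
        batch = collate(items) if collate is not None else default_collate(items)
        # Step 2: optionally post-process the already collated batch.
        if transform_batch is not None:
            batch = transform_batch(batch)
        return batch

    return collate_fn


def pad_to_longest(items):
    # Pad every sequence with zeros to the length of the longest one.
    width = max(len(seq) for seq in items)
    return [seq + [0] * (width - len(seq)) for seq in items]


def add_mask(batch):
    # Mark the non-padding positions of the padded batch.
    return {"tokens": batch, "mask": [[int(t != 0) for t in row] for row in batch]}


collate_fn = make_collate_fn(collate=pad_to_longest, transform_batch=add_mask)
print(collate_fn([[1, 2, 3], [4]]))
# {'tokens': [[1, 2, 3], [4, 0, 0]], 'mask': [[1, 1, 1], [1, 0, 0]]}
```

Splitting the work this way keeps the item-to-batch logic (padding) separate from the batch-level logic (masking), so either can be swapped independently.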
dataloader
dataloader(
batch_size=1, *, shuffle=False, seed=None, num_workers=0, **kwargs
) -> DataLoader
Create a DataLoader for this dataset.
This method is a convenience wrapper around torch.utils.data.DataLoader setting up the required parameters. Most arguments are passed directly to the torch.utils.data.DataLoader, with a few exceptions:
- When seed is given, it is used to construct the generator argument for the DataLoader using torch.Generator().manual_seed(seed); the generator option must not be specified in kwargs.
- When shuffle is False and no generator is given, torch.Generator() is passed as generator. Otherwise, the global random number generator would be used during every construction of an iterator, i.e. during every iter(dataloader) call.
- When num_workers is greater than 0, persistent_workers is set to True.
- When collate or transform_batch is set, self.collate_fn is passed as the collate_fn parameter.
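The four rules above can be summarized as a small argument-resolution sketch. Everything here is illustrative: resolve_loader_kwargs and make_generator are hypothetical names, random.Random stands in for torch.Generator so that the sketch runs without PyTorch, raising ValueError on a conflicting generator is an assumption, and so is letting a user-supplied persistent_workers value take precedence.

```python
import random
from typing import Any, Callable, Dict, Optional


def resolve_loader_kwargs(
    make_generator: Callable[[Optional[int]], Any],
    *,
    shuffle: bool = False,
    seed: Optional[int] = None,
    num_workers: int = 0,
    collate_fn: Optional[Callable] = None,
    **kwargs: Any,
) -> Dict[str, Any]:
    """Hypothetical helper mirroring the documented rules for DataLoader arguments."""
    if seed is not None:
        # Rule 1: a seed builds the generator, so an explicit one is not allowed.
        if "generator" in kwargs:
            raise ValueError("specify either seed or generator, not both")
        kwargs["generator"] = make_generator(seed)
    elif not shuffle and "generator" not in kwargs:
        # Rule 2: a private generator keeps iter(dataloader) off the global RNG.
        kwargs["generator"] = make_generator(None)
    if num_workers > 0:
        # Rule 3 (assumption: an explicit user value is left untouched).
        kwargs.setdefault("persistent_workers", True)
    if collate_fn is not None:
        # Rule 4: passed only when collate or transform_batch is set.
        kwargs["collate_fn"] = collate_fn
    return dict(shuffle=shuffle, num_workers=num_workers, **kwargs)


# Stand-in: random.Random(seed) plays the role of torch.Generator().manual_seed(seed).
resolved = resolve_loader_kwargs(random.Random, seed=42, num_workers=2)
print(sorted(resolved))
# ['generator', 'num_workers', 'persistent_workers', 'shuffle']
```

With a shuffling loader and no seed, the sketch adds no generator, which corresponds to the documented case where the global random number generator is used.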