TransformedDataset
minnt.TransformedDataset
Bases: Dataset
A dataset capable of applying transformations to its items and batches.
The transformation of individual items is specified by setting or overriding the transform attribute.
The transformation of batches can be performed by overriding collate (which processes a list of items into a batch) and/or transform_batch (which modifies the batch after collation).
When collate or transform_batch is overridden, the collate_fn of a torch.utils.data.DataLoader must be set to self.collate_fn, which is done automatically when using the dataloader method of this class.
__init__
Create a new transformed dataset using the provided dataset with an optional limit.
Parameters:
- dataset (Dataset) – The source dataset implementing __len__ and __getitem__.
- dataset_limit (int | None, default: None) – If given, limits the length of the dataset to this value.
Environment variables: The following environment variable can be used to override the method parameters:
- MINNT_DATASET_LIMIT: If set to a positive integer, overrides the dataset_limit parameter.
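The interplay between dataset_limit and MINNT_DATASET_LIMIT can be sketched roughly as follows. This is a minimal illustration, not the library's code: the helper name effective_length is hypothetical, and both clamping the limit to the source length and ignoring non-numeric or non-positive environment values are assumptions.

```python
import os


def effective_length(source_len, dataset_limit=None, environ=None):
    """Hypothetical helper: dataset length after applying the optional limit."""
    environ = os.environ if environ is None else environ
    try:
        env_limit = int(environ.get("MINNT_DATASET_LIMIT", ""))
    except ValueError:
        env_limit = None  # unset or not an integer: keep the parameter value
    # A positive integer in the environment overrides the parameter.
    if env_limit is not None and env_limit > 0:
        dataset_limit = env_limit
    if dataset_limit is None:
        return source_len
    # Assumption: the limit can shorten the dataset but never extend it.
    return min(source_len, dataset_limit)
```

Passing an explicit environ mapping keeps the helper easy to exercise without modifying the real process environment.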
__len__
__len__() -> int
Return the number of items in the dataset.
__getitem__
Return the item at the specified index.
transform
class-attribute
instance-attribute
transform: Callable | None = None
If given, transform is called on each item before returning it.
If the dataset item is a tuple, transform is called with the tuple unpacked.
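The tuple-unpacking rule can be illustrated with a small pure-Python sketch. The helper apply_transform and the transform scale_pixels are hypothetical names used only for illustration; they show the calling convention described above and are not taken from the library.

```python
def apply_transform(item, transform=None):
    """Hypothetical helper mirroring the per-item rule described above."""
    if transform is None:
        return item
    if isinstance(item, tuple):
        # Tuple items are unpacked, so transform receives separate arguments.
        return transform(*item)
    return transform(item)


def scale_pixels(image, label):
    # Illustrative transform for (image, label) pairs.
    return [pixel / 255 for pixel in image], label


result = apply_transform(([0, 255], 3), scale_pixels)
```

A transform written for tuple items therefore takes one positional argument per tuple element rather than a single tuple argument.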
collate
class-attribute
instance-attribute
collate: Callable | None = None
If given, collate is called on a list of items before returning them as a batch.
transform_batch
class-attribute
instance-attribute
transform_batch: Callable | None = None
If given, transform_batch is called on a batch before returning it.
collate_fn
A function for a DataLoader to collate a batch of items using collate and/or transform_batch.
This function is used as the collate_fn parameter of a DataLoader when collate or transform_batch is set.
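The order of the two batch-level hooks can be sketched as follows. This is a simplified stand-in, not the library's implementation: make_collate_fn, pad_sequences, and describe_batch are hypothetical names, and the fallback used when collate is unset is an assumption (with PyTorch, torch.utils.data.default_collate would be a natural choice; a plain list is used here so the sketch runs without it).

```python
def make_collate_fn(collate=None, transform_batch=None, default_collate=list):
    """Hypothetical factory: collate the items first, then transform the batch."""
    def collate_fn(items):
        # Assumption: fall back to a default collation when collate is unset.
        batch = collate(items) if collate is not None else default_collate(items)
        if transform_batch is not None:
            batch = transform_batch(batch)
        return batch
    return collate_fn


def pad_sequences(items):
    # Pad variable-length lists with zeros up to the longest one.
    longest = max(len(seq) for seq in items)
    return [seq + [0] * (longest - len(seq)) for seq in items]


def describe_batch(batch):
    # Post-collation step: attach the padded width of the batch.
    return {"inputs": batch, "width": len(batch[0])}


collate_fn = make_collate_fn(pad_sequences, describe_batch)
batch = collate_fn([[1, 2, 3], [4]])
```

Keeping the two hooks separate lets collate deal with combining items while transform_batch handles work that is cheaper or only possible on a whole batch.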
dataloader
dataloader(
batch_size=1, *, shuffle=False, seed=None, num_workers=0, **kwargs
) -> DataLoader
Create a DataLoader for this dataset.
This method is a convenience wrapper around torch.utils.data.DataLoader that sets up the required parameters. Most arguments are passed directly to torch.utils.data.DataLoader, with a few exceptions:
- When seed is given, it is used to construct the generator argument for the DataLoader using torch.Generator().manual_seed(seed); the generator option must not be specified in kwargs.
- When shuffle is False and no generator is given, torch.Generator() is passed as generator. Otherwise, the global random number generator would be used during every construction of an iterator, i.e. during every iter(dataloader) call.
- When num_workers is greater than 0, persistent_workers is set to True.
- When collate or transform_batch is set, self.collate_fn is passed as the collate_fn parameter.
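The argument rules listed above can be summarized in a small sketch. It is an illustration under stated assumptions rather than the library's code: resolve_loader_kwargs and its batch_hooks_set parameter are hypothetical, FakeGenerator stands in for torch.Generator so the example runs without PyTorch, and raising ValueError when both seed and generator are supplied is an assumed way of enforcing the "must not be specified" rule.

```python
class FakeGenerator:
    """Stand-in for torch.Generator so the sketch runs without PyTorch."""

    def __init__(self):
        self.seed = None

    def manual_seed(self, seed):
        self.seed = seed
        return self  # torch.Generator.manual_seed also returns the generator


def resolve_loader_kwargs(*, shuffle=False, seed=None, num_workers=0,
                          batch_hooks_set=False, collate_fn=None, **kwargs):
    """Hypothetical helper: keyword arguments that would reach the DataLoader."""
    if seed is not None:
        # Assumption: supplying both seed and generator is reported as an error.
        if "generator" in kwargs:
            raise ValueError("pass either seed or generator, not both")
        kwargs["generator"] = FakeGenerator().manual_seed(seed)
    elif not shuffle and "generator" not in kwargs:
        # A private generator keeps iter(dataloader) away from the global RNG.
        kwargs["generator"] = FakeGenerator()
    if num_workers > 0:
        kwargs["persistent_workers"] = True
    if batch_hooks_set:
        # Only installed when collate or transform_batch is set.
        kwargs["collate_fn"] = collate_fn
    return dict(shuffle=shuffle, num_workers=num_workers, **kwargs)


seeded = resolve_loader_kwargs(shuffle=True, seed=42, num_workers=2)
plain = resolve_loader_kwargs()
```

Here seeded carries a generator seeded with 42 and persistent_workers set to True, while plain receives a fresh private generator and no collate_fn entry.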