Models

Definitions of models and corresponding layers

This folder contains all the definitions of model architectures, including common layers that can be shared among different models.

U-Net

Definition of a customized U-Net-like model architecture.

class omnizart.models.u_net.MultiHeadAttention(*args, **kwargs)

Bases: tensorflow.python.keras.engine.base_layer.Layer

Attention layer for 2D input feature.

As the attention mechanism consumes a large amount of memory, we leverage a divide-and-conquer approach implemented in the tensor2tensor repository. The input feature is first partitioned into smaller blocks before self-attention is computed on each. The processed outputs are then assembled back into the same size as the input; see the usage sketch after the parameter list.

Parameters
out_channel: int

Number of output channels.

d_model: int

Dimension of embeddings for each position of input feature.

n_heads: int

Number of heads for multi-head attention computation. Should be a divisor of d_model.

query_shape: Tuple(int, int)

Size of each partition.

memory_flange: Tuple(int, int)

Additional overlapping size by which each partition is extended, meaning the final block size being computed is (query_shape[0]+memory_flange[0]) x (query_shape[1]+memory_flange[1]).
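
A minimal usage sketch; the input shape and all constructor values below are illustrative assumptions, not documented defaults:

    import tensorflow as tf
    from omnizart.models.u_net import MultiHeadAttention

    # Hypothetical 2D feature map: [batch, height, width, channels].
    feature = tf.random.normal([4, 256, 352, 32])

    attn = MultiHeadAttention(
        out_channel=64,         # number of output channels
        d_model=32,             # embedding size per position; matches input channels here
        n_heads=8,              # must be a divisor of d_model
        query_shape=(128, 32),  # size of each partition
        memory_flange=(8, 8),   # overlap extended around each partition
    )
    out = attn(feature)         # same spatial size as the input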

References

This approach originates from [1].

[1] Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Lukasz Kaiser, Noam Shazeer, Alexander Ku, and Dustin Tran, “Image Transformer,” in Proceedings of the 35th International Conference on Machine Learning (ICML), 2018.

Methods

call(inputs)

This is where the layer's logic lives.

get_config()

Returns the config of the layer.

call(inputs)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

omnizart.models.u_net.conv_block(input_tensor, channel, kernel_size, strides=(2, 2), dilation_rate=1, dropout_rate=0.4)

Convolutional encoder block of U-net.

The block is a fully convolutional block. The encoder block does not downsample the input feature, so the output has the same spatial dimensions as the input.
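
A functional-API sketch of using the block; the input shape and kernel size are illustrative assumptions:

    import tensorflow as tf
    from omnizart.models.u_net import conv_block

    # Hypothetical spectrogram-like input.
    inp = tf.keras.Input(shape=(256, 352, 32))
    out = conv_block(inp, channel=64, kernel_size=(3, 3))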

omnizart.models.u_net.semantic_segmentation(feature_num=352, timesteps=256, multi_grid_layer_n=1, multi_grid_n=5, ch_num=1, out_class=2, dropout=0.4)

Improved U-net model with Atrous Spatial Pyramid Pooling (ASPP) block.
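
For instance, building the model with its documented default arguments (a minimal sketch; requires TensorFlow and omnizart installed):

    from omnizart.models.u_net import semantic_segmentation

    # Build the improved U-net with the documented defaults.
    model = semantic_segmentation(feature_num=352, timesteps=256, out_class=2)
    model.summary()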

omnizart.models.u_net.semantic_segmentation_attn(feature_num=352, timesteps=256, ch_num=1, out_class=2)

Customized attention U-net model.

omnizart.models.u_net.transpose_conv_block(input_tensor, channel, kernel_size, strides=(2, 2), dropout_rate=0.4)

Transposed convolutional decoder block of U-net.

Tensor2Tensor

Implementation of memory efficient attention.

The original implementation is from tensor2tensor, rewritten here in TensorFlow 2.0.

class omnizart.models.t2t.MultiHeadAttention(*args, **kwargs)

Multi-head attention Keras layer wrapper.

Methods

call(q, k, v)

This is where the layer's logic lives.

get_config()

Returns the config of the layer.

call(q, k, v)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

omnizart.models.t2t.cast_like(x, y)

Cast x to y’s dtype, if necessary.
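
A small illustration of the documented behavior:

    import tensorflow as tf
    from omnizart.models.t2t import cast_like

    x = tf.constant([1, 2, 3], dtype=tf.int32)
    y = tf.constant([0.0], dtype=tf.float32)
    print(cast_like(x, y).dtype)  # float32; x is cast to match y's dtype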

omnizart.models.t2t.combine_heads_2d(x)

Inverse of split_heads_2d.

Parameters
x

A Tensor with shape [batch, num_heads, height, width, channels / num_heads]

Returns
y

A Tensor with shape [batch, height, width, channels]

omnizart.models.t2t.combine_last_two_dimensions(x)

Reshape x so that the last two dimensions become one.

Parameters
x

A Tensor with shape […, a, b]

Returns
y

A Tensor with shape […, ab]

omnizart.models.t2t.dot_product_attention(q, k, v, bias, dropout_rate=0.0, name=None, save_weights_to=None, dropout_broadcast_dims=None, activation_dtype=None, weight_dtype=None)

Dot-product attention.

Parameters
q

Tensor with shape […, length_q, depth_k].

k

Tensor with shape […, length_kv, depth_k]. Leading dimensions must match those of q.

v

Tensor with shape […, length_kv, depth_v]. Leading dimensions must match those of q.

bias

Bias Tensor (see attention_bias())

dropout_rate: float

Dropout rate of layers.

name: str

An optional string.

save_weights_to: dict

An optional dictionary to capture attention weights for visualization; the weights tensor will be appended there under a string key created from the variable scope (including name).

dropout_broadcast_dims: list

An optional list of integers less than rank of q. Specifies in which dimensions to broadcast the dropout decisions.

activation_dtype:

Used to define the function's activation dtype when using mixed precision.

weight_dtype:

The dtype in which weights are stored when using mixed precision.

Returns
y

Tensor with shape […, length_q, depth_v].
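
A minimal sketch of calling the function with the documented shapes; all sizes below are illustrative:

    import tensorflow as tf
    from omnizart.models.t2t import dot_product_attention

    q = tf.random.normal([2, 8, 100, 64])  # [..., length_q, depth_k]
    k = tf.random.normal([2, 8, 120, 64])  # [..., length_kv, depth_k]
    v = tf.random.normal([2, 8, 120, 64])  # [..., length_kv, depth_v]
    bias = tf.zeros([1, 1, 1, 120])        # all-zero bias, i.e. no masking

    y = dot_product_attention(q, k, v, bias, dropout_rate=0.1)
    print(y.shape)  # (2, 8, 100, 64) -> [..., length_q, depth_v]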

omnizart.models.t2t.dropout_with_broadcast_dims(x, keep_prob, broadcast_dims=None, **kwargs)

Like tf.nn.dropout but takes broadcast_dims instead of noise_shape.

Instead of specifying noise_shape, this function takes broadcast_dims - a list of dimension numbers in which noise_shape should be 1. The random keep/drop tensor has dimensionality 1 along these dimensions.

Parameters
x

A floating point tensor.

keep_prob

A scalar Tensor with the same type as x. The probability that each element is kept.

broadcast_dims: list

An optional list of integers specifying the dimensions along which to broadcast the keep/drop flags.

**kwargs

Keyword arguments to tf.nn.dropout other than noise_shape.

Returns
y

Tensor of the same shape as x.
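
For example, to share one keep/drop decision across the time axis (shapes are illustrative):

    import tensorflow as tf
    from omnizart.models.t2t import dropout_with_broadcast_dims

    x = tf.random.normal([2, 100, 512])
    # The keep/drop mask has size 1 along dim 1, so an entire time axis
    # is kept or dropped together for each (batch, channel) pair.
    y = dropout_with_broadcast_dims(x, keep_prob=0.9, broadcast_dims=[1])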

omnizart.models.t2t.embedding_to_padding(emb)

Calculates the padding mask based on which embeddings are all zero.

We have hacked symbol_modality to return all-zero embeddings for padding.

Parameters
emb:

A Tensor with shape […, depth].

Returns
y

A float Tensor with shape […]. Each element is 1 if its corresponding embedding vector is all zero, and is 0 otherwise.
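
A small illustration of the documented behavior:

    import tensorflow as tf
    from omnizart.models.t2t import embedding_to_padding

    emb = tf.constant([[[0.5, -0.2],
                        [0.0, 0.0]]])   # second position is all-zero padding
    print(embedding_to_padding(emb))    # [[0., 1.]]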

omnizart.models.t2t.gather_blocks_2d(x, indices)

Gathers flattened blocks from x.

omnizart.models.t2t.gather_indices_2d(x, block_shape, block_stride)

Computes the gather indices.

omnizart.models.t2t.local_attention_2d(q, k, v, query_shape=(8, 16), memory_flange=(8, 16), name=None)

Strided block local self-attention.

The 2-D sequence is divided into 2-D blocks of shape query_shape. Attention for a given query position can only see memory positions less than or equal to the query position. The memory positions are those of the corresponding block, extended by memory_flange positions along the height and width of the block (namely, to the left, top, and right).

Parameters
q

A tensor with shape [batch, heads, h, w, depth_k]

k

A tensor with shape [batch, heads, h, w, depth_k]

v

A tensor with shape [batch, heads, h, w, depth_v]. In the current implementation, depth_v must be equal to depth_k.

query_shape: tuple

A tuple indicating the height and width of each query block.

memory_flange: tuple

A tuple indicating how far to look in height and width beyond each query block.

name: str

An optional string.

Returns
y

A Tensor of shape [batch, heads, h, w, depth_v]
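
A minimal sketch with illustrative shapes; note depth_v equals depth_k, as the current implementation requires:

    import tensorflow as tf
    from omnizart.models.t2t import local_attention_2d

    q = tf.random.normal([2, 4, 32, 32, 16])  # [batch, heads, h, w, depth_k]
    k = tf.random.normal([2, 4, 32, 32, 16])
    v = tf.random.normal([2, 4, 32, 32, 16])  # depth_v == depth_k

    y = local_attention_2d(q, k, v, query_shape=(8, 16), memory_flange=(8, 16))
    print(y.shape)  # (2, 4, 32, 32, 16)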

omnizart.models.t2t.maybe_upcast(logits, activation_dtype=None, weight_dtype=None, hparams=None)
omnizart.models.t2t.mixed_precision_is_enabled(activation_dtype=None, weight_dtype=None, hparams=None)
omnizart.models.t2t.pad_to_multiple_2d(x, block_shape)

Pads x so that its height and width are multiples of block_shape.

Parameters
x

A [batch, heads, h, w, depth] or [batch, h, w, depth] tensor

block_shape

A 2D list of integer shapes.

Returns
padded_x

A [batch, heads, h, w, depth] or [batch, h, w, depth] tensor
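
For instance (illustrative shapes):

    import tensorflow as tf
    from omnizart.models.t2t import pad_to_multiple_2d

    x = tf.random.normal([2, 30, 50, 16])  # [batch, h, w, depth]
    padded = pad_to_multiple_2d(x, (8, 16))
    print(padded.shape)  # (2, 32, 64, 16): h padded to a multiple of 8, w to a multiple of 16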

omnizart.models.t2t.positional_encoding(batch_size, timesteps, n_units=512, zero_pad=False, scale=False)
omnizart.models.t2t.relative_positional_encoding(n_steps, n_units=512, max_dist=2)
omnizart.models.t2t.reshape_range(tensor, i, j, shape)

Reshapes a tensor between dimensions i and j.

omnizart.models.t2t.scatter_blocks_2d(x, indices, shape)

Scatters blocks from x into shape with indices.

omnizart.models.t2t.split_heads_2d(x, num_heads)

Split channels (dimension 3) into multiple heads (becomes dimension 1).

Parameters
x

A tensor with shape [batch, height, width, channels]

num_heads: int

Number of heads in the attention computation.

Returns
y

A tensor with shape [batch, num_heads, height, width, channels / num_heads]
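
A roundtrip sketch together with combine_heads_2d (illustrative shapes):

    import tensorflow as tf
    from omnizart.models.t2t import split_heads_2d, combine_heads_2d

    x = tf.random.normal([2, 32, 32, 64])   # [batch, height, width, channels]
    heads = split_heads_2d(x, 8)            # -> (2, 8, 32, 32, 8)
    restored = combine_heads_2d(heads)      # -> (2, 32, 32, 64)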

omnizart.models.t2t.split_last_dimension(x, n)

Reshape x so that the last dimension becomes two dimensions.

The first of these two dimensions is n.

Parameters
x

A Tensor with shape […, m]

n: int

An integer.

Returns
y

A Tensor with shape […, n, m/n]
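
A roundtrip sketch together with combine_last_two_dimensions:

    import tensorflow as tf
    from omnizart.models.t2t import split_last_dimension, combine_last_two_dimensions

    x = tf.random.normal([4, 100, 512])
    h = split_last_dimension(x, 8)             # -> (4, 100, 8, 64)
    restored = combine_last_two_dimensions(h)  # -> (4, 100, 512)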

Spectral Normalization Model

Drum transcription model leveraging spectral normalization.

The model was originally developed with TensorFlow 1.12. We rewrote it with TensorFlow 2.3 and use Keras to implement most of the functionality for better readability.

Original author: I-Chieh Wei. Rewritten by: BreezeWhite.

class omnizart.models.spectral_norm_net.ConvSN2D(*args, **kwargs)

Bases: tensorflow.python.keras.engine.base_layer.Layer

Just a wrapper layer for using spectral normalization.

The original implementation can be found here.

Methods

call(inputs)

This is where the layer's logic lives.

get_config()

This is necessary to save the model architecture.

call(inputs)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

This is necessary to save the model architecture.

class omnizart.models.spectral_norm_net.SpectralNormalization(*args, **kwargs)

Bases: tensorflow.python.keras.layers.wrappers.Wrapper

Spectral normalization layer.

The original implementation can be found here.
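
Since this is a Keras Wrapper subclass, it presumably takes the layer to be normalized as its first argument; a sketch under that assumption:

    import tensorflow as tf
    from omnizart.models.spectral_norm_net import SpectralNormalization

    # Wrap a standard Conv2D so its kernel is spectrally normalized.
    conv = SpectralNormalization(tf.keras.layers.Conv2D(64, (3, 3), padding="same"))
    out = conv(tf.random.normal([1, 128, 128, 3]))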

Methods

build(input_shape)

Creates the variables of the layer (optional, for subclass implementers).

call(inputs)

This is where the layer's logic lives.

restore_weights

update_weights

build(input_shape)

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Args:
input_shape: Instance of TensorShape, or a list of TensorShape instances if the layer expects a list of inputs (one instance per input).

call(inputs)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

restore_weights()
update_weights()
omnizart.models.spectral_norm_net.cnn_attention(x, channels, scope='attention')
omnizart.models.spectral_norm_net.conv_sa(x, channels, kernel=(4, 4), strides=(2, 2), pad=0, pad_type='zero', spectral_norm=True, scope='conv_0')
omnizart.models.spectral_norm_net.down_sample(x)
omnizart.models.spectral_norm_net.drum_model(out_classes, mini_beat_per_seg, res_block_num=3, channels=64, spectral_norm=True)

Get the drum transcription model.

Constructs the drum transcription model instance for training/inference.

Parameters
out_classes: int

Total number of output classes, referring to the types of drums. Currently there are 13 pre-defined drum percussion classes.

mini_beat_per_seg: int

Number of mini beats in a segment. Can be understood as the range of time to be considered for training.

res_block_num: int

Number of residual blocks.

Returns
model: tf.keras.Model

A tensorflow keras model instance.
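
A minimal construction sketch; mini_beat_per_seg=4 is an illustrative value, not a documented default:

    from omnizart.models.spectral_norm_net import drum_model

    # 13 output classes, matching the pre-defined drum percussion classes.
    model = drum_model(out_classes=13, mini_beat_per_seg=4)
    model.summary()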

omnizart.models.spectral_norm_net.residual_block(x, channels, spectral_norm=True, scope='resblock')
omnizart.models.spectral_norm_net.transpose_residual_block(x, channels, to_down=True, spectral_norm=True, scope='transblock')

Chord Transformer

class omnizart.models.chord_model.ChordModel(*args, **kwargs)

Bases: tensorflow.python.keras.engine.training.Model

Chord model written in Keras.

Keras model of the chord submodule. The original implementation is written in TensorFlow 1.11 and can be found here.

The model also implements custom training and test steps due to the specialized loss computation; a construction sketch is given below, before the method listing.

Parameters
num_enc_attn_blocks: int

Number of attention blocks in the encoder.

num_dec_attn_blocks: int

Number of attention blocks in the decoder.

segment_width: int

Context width of each frame. Nearby frames are concatenated along the feature axis. Defaults to 21, meaning the past 10 frames and the future 10 frames are concatenated to the current frame, resulting in a feature dimension of segment_width x freq_size.

freq_size: int

Feature size of the input representation.

out_classes: int

Number of output classes. Currently supports 26 types of chords.

n_steps: int

Time length of the feature.

enc_input_emb_size: int

Embedding size of the encoder’s input.

dec_input_emb_size: int

Embedding size of the decoder’s input.

dropout_rate: float

Dropout rate of all the dropout layers.

annealing_rate: float

Rate of modifying the slope value for each epoch.

**kwargs:

Other keyword parameters that will be passed to initialize the keras.Model.

See also

omnizart.chord.app.chord_loss_func

The customized loss computation function.
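
A construction sketch; every value below is an illustrative assumption, and the class may define its own defaults for the remaining arguments:

    from omnizart.models.chord_model import ChordModel

    model = ChordModel(
        num_enc_attn_blocks=2,
        num_dec_attn_blocks=2,
        segment_width=21,   # past 10 + current + future 10 frames
        freq_size=24,
        out_classes=26,
        n_steps=100,
    )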

Methods

call(feature)

Calls the model on new inputs.

get_config()

Returns the config of the layer.

test_step(data)

The logic for one evaluation step.

train_step(data)

The logic for one training step.

step_in_slope

call(feature)

Calls the model on new inputs.

In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).

Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.

Args:

inputs: A tensor or list of tensors.
training: Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask: A mask or list of masks. A mask can be either a tensor or None (no mask).

Returns:

A tensor if there is a single output, or a list of tensors if there is more than one output.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

step_in_slope()
test_step(data)

The logic for one evaluation step.

This method can be overridden to support custom evaluation logic. This method is called by Model.make_test_function.

This function should contain the mathematical logic for one step of evaluation. This typically includes the forward pass, loss calculation, and metrics updates.

Configuration details for how this logic is run (e.g. tf.function and tf.distribute.Strategy settings), should be left to Model.make_test_function, which can also be overridden.

Args:

data: A nested structure of `Tensor`s.

Returns:

A dict containing values that will be passed to tf.keras.callbacks.CallbackList.on_train_batch_end. Typically, the values of the Model’s metrics are returned.

train_step(data)

The logic for one training step.

This method can be overridden to support custom training logic. This method is called by Model.make_train_function.

This method should contain the mathematical logic for one step of training. This typically includes the forward pass, loss calculation, backpropagation, and metric updates.

Configuration details for how this logic is run (e.g. tf.function and tf.distribute.Strategy settings), should be left to Model.make_train_function, which can also be overridden.

Args:

data: A nested structure of `Tensor`s.

Returns:

A dict containing values that will be passed to tf.keras.callbacks.CallbackList.on_train_batch_end. Typically, the values of the Model’s metrics are returned. Example: {‘loss’: 0.2, ‘accuracy’: 0.7}.

class omnizart.models.chord_model.Decoder(*args, **kwargs)

Bases: tensorflow.python.keras.engine.base_layer.Layer

Decoder layer of the transformer model.

Parameters
out_classes: int

Number of output classes. Currently supports 26 types of chords.

num_attn_blocks: int

Number of attention blocks.

n_steps: int

Time length of the feature.

dec_input_emb_size: int

Embedding size of the decoder’s input.

segment_width: int

Context width of each frame. Nearby frames are concatenated along the feature axis. Defaults to 21, meaning the past 10 frames and the future 10 frames are concatenated to the current frame, resulting in a feature dimension of segment_width x freq_size.

freq_size: int

Feature size of the input representation.

dropout_rate: float

Dropout rate of all the dropout layers.

**kwargs:

Other keyword parameters that will be passed to initialize keras.layers.Layer.

Methods

call(inp, encoder_input_emb, chord_change_pred)

This is where the layer's logic lives.

get_config()

Returns the config of the layer.

call(inp, encoder_input_emb, chord_change_pred)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

class omnizart.models.chord_model.EncodeSegmentFrequency(*args, **kwargs)

Bases: tensorflow.python.keras.engine.base_layer.Layer

Encode feature along the frequency axis.

Parameters
n_units: int

Output embedding size.

n_steps: int

Time length of the feature.

segment_width: int

Context width of each frame. Nearby frames are concatenated along the feature axis. Defaults to 21, meaning the past 10 frames and the future 10 frames are concatenated to the current frame, resulting in a feature dimension of segment_width x freq_size.

freq_size: int

Feature size of the input representation.

dropout_rate: float

Dropout rate of all dropout layers.

Methods

call(inp)

This is where the layer's logic lives.

get_config()

Returns the config of the layer.

call(inp)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

class omnizart.models.chord_model.EncodeSegmentTime(*args, **kwargs)

Bases: tensorflow.python.keras.engine.base_layer.Layer

Encode feature along the time axis.

Parameters
n_units: int

Output embedding size.

n_steps: int

Time length of the feature.

segment_width: int

Context width of each frame. Nearby frames are concatenated along the feature axis. Defaults to 21, meaning the past 10 frames and the future 10 frames are concatenated to the current frame, resulting in a feature dimension of segment_width x freq_size.

freq_size: int

Feature size of the input representation.

dropout_rate: float

Dropout rate of all dropout layers.

Methods

call(inp)

This is where the layer's logic lives.

get_config()

Returns the config of the layer.

call(inp)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

class omnizart.models.chord_model.Encoder(*args, **kwargs)

Bases: tensorflow.python.keras.engine.base_layer.Layer

Encoder layer of the transformer model.

Parameters
num_attn_blocks: int

Number of attention blocks.

n_steps: int

Time length of the feature.

enc_input_emb_size: int

Embedding size of the encoder’s input.

segment_width: int

Context width of each frame. Nearby frames are concatenated along the feature axis. Defaults to 21, meaning the past 10 frames and the future 10 frames are concatenated to the current frame, resulting in a feature dimension of segment_width x freq_size.

freq_size: int

Feature size of the input representation.

dropout_rate: float

Dropout rate of all the dropout layers.

**kwargs:

Other keyword parameters that will be passed to initialize keras.layers.Layer.

Methods

call(inp[, slope])

This is where the layer's logic lives.

get_config()

Returns the config of the layer.

call(inp, slope=1)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

class omnizart.models.chord_model.FeedForward(*args, **kwargs)

Bases: tensorflow.python.keras.engine.base_layer.Layer

Feedforward layer of the transformer model.

Methods

call(inp)

This is where the layer's logic lives.

get_config()

Returns the config of the layer.

call(inp)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

class omnizart.models.chord_model.ReduceSlope

Bases: tensorflow.python.keras.callbacks.Callback

Custom keras callback for reducing slope value after each epoch.

Methods

on_epoch_end(epoch[, logs])

Called at the end of an epoch.

on_epoch_end(epoch, logs=None)

Called at the end of an epoch.

Subclasses should override for any actions to run. This function should only be called during TRAIN mode.

Args:

epoch: Integer, index of epoch.
logs: Dict, metric results for this training epoch, and for the validation epoch if validation is performed. Validation result keys are prefixed with val_. For the training epoch, the values of the Model’s metrics are returned. Example: {‘loss’: 0.2, ‘accuracy’: 0.7}.

omnizart.models.chord_model.binary_round(inp, cast_to_int=False)
omnizart.models.chord_model.chord_block_compression(hidden_states, chord_changes)
omnizart.models.chord_model.chord_block_decompression(compressed_seq, block_ids)

Pyramid Net

class omnizart.models.pyramid_net.PyramidBlock(*args, **kwargs)

Bases: tensorflow.python.keras.engine.base_layer.Layer

Pyramid block for building pyramid net.

Methods

call(inputs[, is_training])

This is where the layer's logic lives.

get_config()

Returns the config of the layer.

call(inputs, is_training=True)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

class omnizart.models.pyramid_net.PyramidNet(*args, **kwargs)

Bases: tensorflow.python.keras.engine.training.Model

Pyramid Net with shake drop layer.

Methods

call(inputs[, is_training])

Calls the model on new inputs.

get_config()

Returns the config of the layer.

test_step(data)

The logic for one evaluation step.

train_step(data)

The logic for one training step.

call(inputs, is_training=True)

Calls the model on new inputs.

In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).

Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.

Args:

inputs: A tensor or list of tensors.
training: Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask: A mask or list of masks. A mask can be either a tensor or None (no mask).

Returns:

A tensor if there is a single output, or a list of tensors if there is more than one output.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

test_step(data)

The logic for one evaluation step.

This method can be overridden to support custom evaluation logic. This method is called by Model.make_test_function.

This function should contain the mathematical logic for one step of evaluation. This typically includes the forward pass, loss calculation, and metrics updates.

Configuration details for how this logic is run (e.g. tf.function and tf.distribute.Strategy settings), should be left to Model.make_test_function, which can also be overridden.

Args:

data: A nested structure of `Tensor`s.

Returns:

A dict containing values that will be passed to tf.keras.callbacks.CallbackList.on_train_batch_end. Typically, the values of the Model’s metrics are returned.

train_step(data)

The logic for one training step.

This method can be overridden to support custom training logic. This method is called by Model.make_train_function.

This method should contain the mathematical logic for one step of training. This typically includes the forward pass, loss calculation, backpropagation, and metric updates.

Configuration details for how this logic is run (e.g. tf.function and tf.distribute.Strategy settings), should be left to Model.make_train_function, which can also be overridden.

Args:

data: A nested structure of `Tensor`s.

Returns:

A dict containing values that will be passed to tf.keras.callbacks.CallbackList.on_train_batch_end. Typically, the values of the Model’s metrics are returned. Example: {‘loss’: 0.2, ‘accuracy’: 0.7}.

class omnizart.models.pyramid_net.ShakeDrop(*args, **kwargs)

Bases: tensorflow.python.keras.engine.base_layer.Layer

Shake drop layer.

Most of the code follows the implementation from tensorflow research [1].

References

[1] https://github.com/tensorflow/models/blob/master/research/autoaugment/shake_drop.py

Methods

call(inputs[, is_training])

This is where the layer's logic lives.

get_config()

Returns the config of the layer.

call(inputs, is_training=True)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

Utils

omnizart.models.utils.shape_list(input_tensor)

Return list of dims, statically where possible.
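
A small illustration; the printed output assumes a fully static shape:

    import tensorflow as tf
    from omnizart.models.utils import shape_list

    x = tf.zeros([2, 3, 4])
    print(shape_list(x))  # [2, 3, 4] -- plain Python ints where statically known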