Models

Definitions of models and corresponding layers

This folder contains all the definitions of model architectures, including common layers that can be shared among different models.

U-Net

Definition of a customized U-Net-like model architecture.

class omnizart.models.u_net.MultiHeadAttention(*args, **kwargs)

Bases: tensorflow.python.keras.engine.base_layer.Layer

Attention layer for 2D input feature.

As the attention mechanism consumes a large amount of memory, we leverage a divide-and-conquer approach implemented in the tensor2tensor repository. The input feature is first partitioned into smaller blocks before self-attention is computed on each. The processed outputs are then assembled back into the same size as the input; see the usage sketch after the parameter list.

Parameters
out_channel: int

Number of output channels.

d_model: int

Dimension of embeddings for each position of input feature.

n_heads: int

Number of heads for multi-head attention computation. Should be a divisor of d_model.

query_shape: Tuple(int, int)

Size of each partition.

memory_flange: Tuple(int, int)

Additional overlapping size by which each partition is extended, meaning the final block size being computed is (query_shape[0]+memory_flange[0]) x (query_shape[1]+memory_flange[1]).
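
A minimal usage sketch; the input shape and all constructor values below are illustrative assumptions, not documented defaults:

    import tensorflow as tf
    from omnizart.models.u_net import MultiHeadAttention

    # Hypothetical 2D feature map: [batch, height, width, channels].
    feature = tf.random.normal([4, 256, 352, 32])

    attn = MultiHeadAttention(
        out_channel=64,         # number of output channels
        d_model=32,             # embedding size per position; matches input channels here
        n_heads=8,              # must be a divisor of d_model
        query_shape=(128, 32),  # size of each partition
        memory_flange=(8, 8),   # overlap extended around each partition
    )
    out = attn(feature)         # same spatial size as the input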

References

This approach originates from [1].

[1] Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Lukasz Kaiser, Noam Shazeer, Alexander Ku, and Dustin Tran, “Image Transformer,” in Proceedings of the 35th International Conference on Machine Learning (ICML), 2018.

Methods

call(inputs)

This is where the layer's logic lives.

get_config()

Returns the config of the layer.

call(inputs)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

omnizart.models.u_net.conv_block(input_tensor, channel, kernel_size, strides=(2, 2), dilation_rate=1, dropout_rate=0.4)

Convolutional encoder block of U-net.

The block is a fully convolutional block. The encoder block does not downsample the input feature, so the output has the same spatial dimensions as the input.
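
A functional-API sketch of using the block; the input shape and kernel size are illustrative assumptions:

    import tensorflow as tf
    from omnizart.models.u_net import conv_block

    # Hypothetical spectrogram-like input.
    inp = tf.keras.Input(shape=(256, 352, 32))
    out = conv_block(inp, channel=64, kernel_size=(3, 3))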

omnizart.models.u_net.semantic_segmentation(feature_num=352, timesteps=256, multi_grid_layer_n=1, multi_grid_n=5, ch_num=1, out_class=2, dropout=0.4)

Improved U-net model with Atrous Spatial Pyramid Pooling (ASPP) block.
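
For instance, building the model with its documented default arguments (a minimal sketch; requires TensorFlow and omnizart installed):

    from omnizart.models.u_net import semantic_segmentation

    # Build the improved U-net with the documented defaults.
    model = semantic_segmentation(feature_num=352, timesteps=256, out_class=2)
    model.summary()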

omnizart.models.u_net.semantic_segmentation_attn(feature_num=352, timesteps=256, ch_num=1, out_class=2)

Customized attention U-net model.

omnizart.models.u_net.transpose_conv_block(input_tensor, channel, kernel_size, strides=(2, 2), dropout_rate=0.4)

Transposed convolutional decoder block of U-net.

Tensor2Tensor

Implementation of memory efficient attention.

The original implementation is from tensor2tensor, rewritten here in TensorFlow 2.0.

class omnizart.models.t2t.MultiHeadAttention(*args, **kwargs)

Multi-head attention Keras layer wrapper.

Methods

call(q, k, v)

This is where the layer's logic lives.

get_config()

Returns the config of the layer.

call(q, k, v)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

omnizart.models.t2t.cast_like(x, y)

Cast x to y’s dtype, if necessary.
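
A small illustration of the documented behavior:

    import tensorflow as tf
    from omnizart.models.t2t import cast_like

    x = tf.constant([1, 2, 3], dtype=tf.int32)
    y = tf.constant([0.0], dtype=tf.float32)
    print(cast_like(x, y).dtype)  # float32; x is cast to match y's dtype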

omnizart.models.t2t.combine_heads_2d(x)

Inverse of split_heads_2d.

Parameters
x

A Tensor with shape [batch, num_heads, height, width, channels / num_heads]

Returns
y

A Tensor with shape [batch, height, width, channels]

omnizart.models.t2t.combine_last_two_dimensions(x)

Reshape x so that the last two dimensions become one.

Parameters
x

A Tensor with shape […, a, b]

Returns
y

A Tensor with shape […, ab]

omnizart.models.t2t.dot_product_attention(q, k, v, bias, dropout_rate=0.0, name=None, save_weights_to=None, dropout_broadcast_dims=None, activation_dtype=None, weight_dtype=None)

Dot-product attention.

Parameters
q

Tensor with shape […, length_q, depth_k].

k

Tensor with shape […, length_kv, depth_k]. Leading dimensions must match those of q.

v

Tensor with shape […, length_kv, depth_v]. Leading dimensions must match those of q.

bias

Bias Tensor (see attention_bias())

dropout_rate: float

Dropout rate of layers.

name: str

An optional string.

save_weights_to: dict

An optional dictionary to capture attention weights for visualization; the weights tensor will be appended there under a string key created from the variable scope (including name).

dropout_broadcast_dims: list

An optional list of integers less than rank of q. Specifies in which dimensions to broadcast the dropout decisions.

activation_dtype:

Used to define the function's activation dtype when using mixed precision.

weight_dtype:

The dtype in which weights are stored when using mixed precision.

Returns
y

Tensor with shape […, length_q, depth_v].
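
A minimal sketch of calling the function with the documented shapes; all sizes below are illustrative:

    import tensorflow as tf
    from omnizart.models.t2t import dot_product_attention

    q = tf.random.normal([2, 8, 100, 64])  # [..., length_q, depth_k]
    k = tf.random.normal([2, 8, 120, 64])  # [..., length_kv, depth_k]
    v = tf.random.normal([2, 8, 120, 64])  # [..., length_kv, depth_v]
    bias = tf.zeros([1, 1, 1, 120])        # all-zero bias, i.e. no masking

    y = dot_product_attention(q, k, v, bias, dropout_rate=0.1)
    print(y.shape)  # (2, 8, 100, 64) -> [..., length_q, depth_v]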

omnizart.models.t2t.dropout_with_broadcast_dims(x, keep_prob, broadcast_dims=None, **kwargs)

Like tf.nn.dropout but takes broadcast_dims instead of noise_shape.

Instead of specifying noise_shape, this function takes broadcast_dims - a list of dimension numbers in which noise_shape should be 1. The random keep/drop tensor has dimensionality 1 along these dimensions.

Parameters
x

A floating point tensor.

keep_prob

A scalar Tensor with the same type as x. The probability that each element is kept.

broadcast_dims: list

An optional list of integers specifying the dimensions along which to broadcast the keep/drop flags.

**kwargs

Keyword arguments to tf.nn.dropout other than noise_shape.

Returns
y

Tensor of the same shape as x.
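
For example, to share one keep/drop decision across the time axis (shapes are illustrative):

    import tensorflow as tf
    from omnizart.models.t2t import dropout_with_broadcast_dims

    x = tf.random.normal([2, 100, 512])
    # The keep/drop mask has size 1 along dim 1, so an entire time axis
    # is kept or dropped together for each (batch, channel) pair.
    y = dropout_with_broadcast_dims(x, keep_prob=0.9, broadcast_dims=[1])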

omnizart.models.t2t.embedding_to_padding(emb)

Calculates the padding mask based on which embeddings are all zero.

We have hacked symbol_modality to return all-zero embeddings for padding.

Parameters
emb:

A Tensor with shape […, depth].

Returns
y

A float Tensor with shape […]. Each element is 1 if its corresponding embedding vector is all zero, and is 0 otherwise.
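
A small illustration of the documented behavior:

    import tensorflow as tf
    from omnizart.models.t2t import embedding_to_padding

    emb = tf.constant([[[0.5, -0.2],
                        [0.0, 0.0]]])   # second position is all-zero padding
    print(embedding_to_padding(emb))    # [[0., 1.]]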

omnizart.models.t2t.gather_blocks_2d(x, indices)

Gathers flattened blocks from x.

omnizart.models.t2t.gather_indices_2d(x, block_shape, block_stride)

Computes the gather indices.

omnizart.models.t2t.local_attention_2d(q, k, v, query_shape=(8, 16), memory_flange=(8, 16), name=None)

Strided block local self-attention.

The 2-D sequence is divided into 2-D blocks of shape query_shape. Attention for a given query position can only see memory positions less than or equal to the query position. The memory positions are those of the corresponding block, extended by memory_flange positions along the height and width of the block (namely, to the left, top, and right).

Parameters
q

A tensor with shape [batch, heads, h, w, depth_k]

k

A tensor with shape [batch, heads, h, w, depth_k]

v

A tensor with shape [batch, heads, h, w, depth_v]. In the current implementation, depth_v must be equal to depth_k.

query_shape: tuple

A tuple indicating the height and width of each query block.

memory_flange: tuple

A tuple indicating how far to look in height and width beyond each query block.

name: str

An optional string.

Returns
y

A Tensor of shape [batch, heads, h, w, depth_v]
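
A minimal sketch with illustrative shapes; note depth_v equals depth_k, as the current implementation requires:

    import tensorflow as tf
    from omnizart.models.t2t import local_attention_2d

    q = tf.random.normal([2, 4, 32, 32, 16])  # [batch, heads, h, w, depth_k]
    k = tf.random.normal([2, 4, 32, 32, 16])
    v = tf.random.normal([2, 4, 32, 32, 16])  # depth_v == depth_k

    y = local_attention_2d(q, k, v, query_shape=(8, 16), memory_flange=(8, 16))
    print(y.shape)  # (2, 4, 32, 32, 16)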

omnizart.models.t2t.maybe_upcast(logits, activation_dtype=None, weight_dtype=None, hparams=None)
omnizart.models.t2t.mixed_precision_is_enabled(activation_dtype=None, weight_dtype=None, hparams=None)
omnizart.models.t2t.pad_to_multiple_2d(x, block_shape)

Pads x so that its height and width are multiples of block_shape.

Parameters
x

A [batch, heads, h, w, depth] or [batch, h, w, depth] tensor

block_shape

A 2D list of integer shapes.

Returns
padded_x

A [batch, heads, h, w, depth] or [batch, h, w, depth] tensor
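
For instance (illustrative shapes):

    import tensorflow as tf
    from omnizart.models.t2t import pad_to_multiple_2d

    x = tf.random.normal([2, 30, 50, 16])  # [batch, h, w, depth]
    padded = pad_to_multiple_2d(x, (8, 16))
    print(padded.shape)  # (2, 32, 64, 16): h padded to a multiple of 8, w to a multiple of 16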

omnizart.models.t2t.positional_encoding(batch_size, timesteps, n_units=512, zero_pad=False, scale=False)
omnizart.models.t2t.relative_positional_encoding(n_steps, n_units=512, max_dist=2)
omnizart.models.t2t.reshape_range(tensor, i, j, shape)

Reshapes a tensor between dimensions i and j.

omnizart.models.t2t.scatter_blocks_2d(x, indices, shape)

Scatters blocks from x into shape with indices.

omnizart.models.t2t.split_heads_2d(x, num_heads)

Split channels (dimension 3) into multiple heads (becomes dimension 1).

Parameters
x

A tensor with shape [batch, height, width, channels]

num_heads: int

Number of heads in the attention computation.

Returns
y

A tensor with shape [batch, num_heads, height, width, channels / num_heads]
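
A roundtrip sketch together with combine_heads_2d (illustrative shapes):

    import tensorflow as tf
    from omnizart.models.t2t import split_heads_2d, combine_heads_2d

    x = tf.random.normal([2, 32, 32, 64])   # [batch, height, width, channels]
    heads = split_heads_2d(x, 8)            # -> (2, 8, 32, 32, 8)
    restored = combine_heads_2d(heads)      # -> (2, 32, 32, 64)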

omnizart.models.t2t.split_last_dimension(x, n)

Reshape x so that the last dimension becomes two dimensions.

The first of these two dimensions is n.

Parameters
x

A Tensor with shape […, m]

n: int

An integer.

Returns
y

A Tensor with shape […, n, m/n]
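
A roundtrip sketch together with combine_last_two_dimensions:

    import tensorflow as tf
    from omnizart.models.t2t import split_last_dimension, combine_last_two_dimensions

    x = tf.random.normal([4, 100, 512])
    h = split_last_dimension(x, 8)             # -> (4, 100, 8, 64)
    restored = combine_last_two_dimensions(h)  # -> (4, 100, 512)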

Spectral Normalization Model

Drum transcription model leveraging spectral normalization.

The model was originally developed with TensorFlow 1.12. We rewrote it with TensorFlow 2.3 and use Keras to implement most of the functionality for better readability.

Original author: I-Chieh Wei. Rewritten by: BreezeWhite.

class omnizart.models.spectral_norm_net.ConvSN2D(*args, **kwargs)

Bases: tensorflow.python.keras.engine.base_layer.Layer

Just a wrapper layer for using spectral normalization.

The original implementation can be found here.

Methods

call(inputs)

This is where the layer's logic lives.

get_config()

This is necessary to save the model architecture.

call(inputs)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

This is necessary to save the model architecture.

class omnizart.models.spectral_norm_net.SpectralNormalization(*args, **kwargs)

Bases: tensorflow.python.keras.layers.wrappers.Wrapper

Spectral normalization layer.

The original implementation can be found here.
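
Since this is a Keras Wrapper subclass, it presumably takes the layer to be normalized as its first argument; a sketch under that assumption:

    import tensorflow as tf
    from omnizart.models.spectral_norm_net import SpectralNormalization

    # Wrap a standard Conv2D so its kernel is spectrally normalized.
    conv = SpectralNormalization(tf.keras.layers.Conv2D(64, (3, 3), padding="same"))
    out = conv(tf.random.normal([1, 128, 128, 3]))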

Methods

build(input_shape)

Creates the variables of the layer (optional, for subclass implementers).

call(inputs)

This is where the layer's logic lives.

restore_weights

update_weights

build(input_shape)

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Args:
input_shape: Instance of TensorShape, or a list of TensorShape instances if the layer expects a list of inputs (one instance per input).

call(inputs)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

restore_weights()
update_weights()
omnizart.models.spectral_norm_net.cnn_attention(x, channels, scope='attention')
omnizart.models.spectral_norm_net.conv_sa(x, channels, kernel=(4, 4), strides=(2, 2), pad=0, pad_type='zero', spectral_norm=True, scope='conv_0')
omnizart.models.spectral_norm_net.down_sample(x)
omnizart.models.spectral_norm_net.drum_model(out_classes, mini_beat_per_seg, res_block_num=3, channels=64, spectral_norm=True)

Get the drum transcription model.

Constructs the drum transcription model instance for training/inference.

Parameters
out_classes: int

Total number of output classes, referring to the types of drums. Currently there are 13 pre-defined drum percussion classes.

mini_beat_per_seg: int

Number of mini beats in a segment. Can be understood as the range of time to be considered for training.

res_block_num: int

Number of residual blocks.

Returns
model: tf.keras.Model

A tensorflow keras model instance.
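
A minimal construction sketch; mini_beat_per_seg=4 is an illustrative value, not a documented default:

    from omnizart.models.spectral_norm_net import drum_model

    # 13 output classes, matching the pre-defined drum percussion classes.
    model = drum_model(out_classes=13, mini_beat_per_seg=4)
    model.summary()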

omnizart.models.spectral_norm_net.residual_block(x, channels, spectral_norm=True, scope='resblock')
omnizart.models.spectral_norm_net.transpose_residual_block(x, channels, to_down=True, spectral_norm=True, scope='transblock')

Chord Transformer

class omnizart.models.chord_model.ChordModel(*args, **kwargs)

Bases: tensorflow.python.keras.engine.training.Model

Chord model written in Keras.

Keras model of the chord submodule. The original implementation is written in TensorFlow 1.11 and can be found here.

The model also implements custom training and test steps due to the specialized loss computation; a construction sketch is given below, before the method listing.

Parameters
num_enc_attn_blocks: int

Number of attention blocks in the encoder.

num_dec_attn_blocks: int

Number of attention blocks in the decoder.

segment_width: int

Context width of each frame. Nearby frames are concatenated along the feature axis. Defaults to 21, meaning the past 10 frames and the future 10 frames are concatenated to the current frame, resulting in a feature dimension of segment_width x freq_size.

freq_size: int

Feature size of the input representation.

out_classes: int

Number of output classes. Currently supports 26 types of chords.

n_steps: int

Time length of the feature.

enc_input_emb_size: int

Embedding size of the encoder’s input.

dec_input_emb_size: int

Embedding size of the decoder’s input.

dropout_rate: float

Dropout rate of all the dropout layers.

annealing_rate: float

Rate of modifying the slope value for each epoch.

**kwargs:

Other keyword parameters that will be passed to initialize the keras.Model.

See also

omnizart.chord.app.chord_loss_func

The customized loss computation function.
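
A construction sketch; every value below is an illustrative assumption, and the class may define its own defaults for the remaining arguments:

    from omnizart.models.chord_model import ChordModel

    model = ChordModel(
        num_enc_attn_blocks=2,
        num_dec_attn_blocks=2,
        segment_width=21,   # past 10 + current + future 10 frames
        freq_size=24,
        out_classes=26,
        n_steps=100,
    )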

Methods

call(feature)

Calls the model on new inputs.

get_config()

Returns the config of the layer.

test_step(data)

The logic for one evaluation step.

train_step(data)

The logic for one training step.

step_in_slope

call(feature)

Calls the model on new inputs.

In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).

Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.

Args:

inputs: A tensor or list of tensors.
training: Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask: A mask or list of masks. A mask can be either a tensor or None (no mask).

Returns:

A tensor if there is a single output, or a list of tensors if there is more than one output.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

step_in_slope()
test_step(data)

The logic for one evaluation step.

This method can be overridden to support custom evaluation logic. This method is called by Model.make_test_function.

This function should contain the mathematical logic for one step of evaluation. This typically includes the forward pass, loss calculation, and metrics updates.

Configuration details for how this logic is run (e.g. tf.function and tf.distribute.Strategy settings), should be left to Model.make_test_function, which can also be overridden.

Args:

data: A nested structure of `Tensor`s.

Returns:

A dict containing values that will be passed to tf.keras.callbacks.CallbackList.on_train_batch_end. Typically, the values of the Model’s metrics are returned.

train_step(data)

The logic for one training step.

This method can be overridden to support custom training logic. This method is called by Model.make_train_function.

This method should contain the mathematical logic for one step of training. This typically includes the forward pass, loss calculation, backpropagation, and metric updates.

Configuration details for how this logic is run (e.g. tf.function and tf.distribute.Strategy settings), should be left to Model.make_train_function, which can also be overridden.

Args:

data: A nested structure of `Tensor`s.

Returns:

A dict containing values that will be passed to tf.keras.callbacks.CallbackList.on_train_batch_end. Typically, the values of the Model’s metrics are returned. Example: {‘loss’: 0.2, ‘accuracy’: 0.7}.

class omnizart.models.chord_model.Decoder(*args, **kwargs)

Bases: tensorflow.python.keras.engine.base_layer.Layer

Decoder layer of the transformer model.

Parameters
out_classes: int

Number of output classes. Currently supports 26 types of chords.

num_attn_blocks: int

Number of attention blocks.

n_steps: int

Time length of the feature.

dec_input_emb_size: int

Embedding size of the decoder’s input.

segment_width: int

Context width of each frame. Nearby frames are concatenated along the feature axis. Defaults to 21, meaning the past 10 frames and the future 10 frames are concatenated to the current frame, resulting in a feature dimension of segment_width x freq_size.

freq_size: int

Feature size of the input representation.

dropout_rate: float

Dropout rate of all the dropout layers.

**kwargs:

Other keyword parameters that will be passed to initialize keras.layers.Layer.

Methods

call(inp, encoder_input_emb, chord_change_pred)

This is where the layer's logic lives.

get_config()

Returns the config of the layer.

call(inp, encoder_input_emb, chord_change_pred)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

class omnizart.models.chord_model.EncodeSegmentFrequency(*args, **kwargs)

Bases: tensorflow.python.keras.engine.base_layer.Layer

Encode feature along the frequency axis.

Parameters
n_units: int

Output embedding size.

n_steps: int

Time length of the feature.

segment_width: int

Context width of each frame. Nearby frames are concatenated along the feature axis. Defaults to 21, meaning the past 10 frames and the future 10 frames are concatenated to the current frame, resulting in a feature dimension of segment_width x freq_size.

freq_size: int

Feature size of the input representation.

dropout_rate: float

Dropout rate of all dropout layers.

Methods

call(inp)

This is where the layer's logic lives.

get_config()

Returns the config of the layer.

call(inp)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

class omnizart.models.chord_model.EncodeSegmentTime(*args, **kwargs)

Bases: tensorflow.python.keras.engine.base_layer.Layer

Encode feature along the time axis.

Parameters
n_units: int

Output embedding size.

n_steps: int

Time length of the feature.

segment_width: int

Context width of each frame. Nearby frames are concatenated along the feature axis. Defaults to 21, meaning the past 10 frames and the future 10 frames are concatenated to the current frame, resulting in a feature dimension of segment_width x freq_size.

freq_size: int

Feature size of the input representation.

dropout_rate: float

Dropout rate of all dropout layers.

Methods

call(inp)

This is where the layer's logic lives.

get_config()

Returns the config of the layer.

call(inp)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

class omnizart.models.chord_model.Encoder(*args, **kwargs)

Bases: tensorflow.python.keras.engine.base_layer.Layer

Encoder layer of the transformer model.

Parameters
num_attn_blocks: int

Number of attention blocks.

n_steps: int

Time length of the feature.

enc_input_emb_size: int

Embedding size of the encoder’s input.

segment_width: int

Context width of each frame. Nearby frames are concatenated along the feature axis. Defaults to 21, meaning the past 10 frames and the future 10 frames are concatenated to the current frame, resulting in a feature dimension of segment_width x freq_size.

freq_size: int

Feature size of the input representation.

dropout_rate: float

Dropout rate of all the dropout layers.

**kwargs:

Other keyword parameters that will be passed to initialize keras.layers.Layer.

Methods

call(inp[, slope])

This is where the layer's logic lives.

get_config()

Returns the config of the layer.

call(inp, slope=1)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

class omnizart.models.chord_model.FeedForward(*args, **kwargs)

Bases: tensorflow.python.keras.engine.base_layer.Layer

Feedforward layer of the transformer model.

Methods

call(inp)

This is where the layer's logic lives.

get_config()

Returns the config of the layer.

call(inp)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

class omnizart.models.chord_model.ReduceSlope

Bases: tensorflow.python.keras.callbacks.Callback

Custom keras callback for reducing slope value after each epoch.

Methods

on_epoch_end(epoch[, logs])

Called at the end of an epoch.

on_epoch_end(epoch, logs=None)

Called at the end of an epoch.

Subclasses should override for any actions to run. This function should only be called during TRAIN mode.

Args:

epoch: Integer, index of epoch.
logs: Dict, metric results for this training epoch, and for the validation epoch if validation is performed. Validation result keys are prefixed with val_. For the training epoch, the values of the Model’s metrics are returned. Example: {‘loss’: 0.2, ‘accuracy’: 0.7}.

omnizart.models.chord_model.binary_round(inp, cast_to_int=False)
omnizart.models.chord_model.chord_block_compression(hidden_states, chord_changes)
omnizart.models.chord_model.chord_block_decompression(compressed_seq, block_ids)

Pyramid Net

class omnizart.models.pyramid_net.PyramidBlock(*args, **kwargs)

Bases: tensorflow.python.keras.engine.base_layer.Layer

Pyramid block for building pyramid net.

Methods

call(inputs[, is_training])

This is where the layer's logic lives.

get_config()

Returns the config of the layer.

call(inputs, is_training=True)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

class omnizart.models.pyramid_net.PyramidNet(*args, **kwargs)

Bases: tensorflow.python.keras.engine.training.Model

Pyramid Net with shake drop layer.

Methods

call(inputs[, is_training])

Calls the model on new inputs.

get_config()

Returns the config of the layer.

test_step(data)

The logic for one evaluation step.

train_step(data)

The logic for one training step.

call(inputs, is_training=True)

Calls the model on new inputs.

In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).

Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.

Args:

inputs: A tensor or list of tensors.
training: Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask: A mask or list of masks. A mask can be either a tensor or None (no mask).

Returns:

A tensor if there is a single output, or a list of tensors if there is more than one output.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

test_step(data)

The logic for one evaluation step.

This method can be overridden to support custom evaluation logic. This method is called by Model.make_test_function.

This function should contain the mathematical logic for one step of evaluation. This typically includes the forward pass, loss calculation, and metrics updates.

Configuration details for how this logic is run (e.g. tf.function and tf.distribute.Strategy settings), should be left to Model.make_test_function, which can also be overridden.

Args:

data: A nested structure of `Tensor`s.

Returns:

A dict containing values that will be passed to tf.keras.callbacks.CallbackList.on_train_batch_end. Typically, the values of the Model’s metrics are returned.

train_step(data)

The logic for one training step.

This method can be overridden to support custom training logic. This method is called by Model.make_train_function.

This method should contain the mathematical logic for one step of training. This typically includes the forward pass, loss calculation, backpropagation, and metric updates.

Configuration details for how this logic is run (e.g. tf.function and tf.distribute.Strategy settings), should be left to Model.make_train_function, which can also be overridden.

Args:

data: A nested structure of `Tensor`s.

Returns:

A dict containing values that will be passed to tf.keras.callbacks.CallbackList.on_train_batch_end. Typically, the values of the Model’s metrics are returned. Example: {‘loss’: 0.2, ‘accuracy’: 0.7}.

class omnizart.models.pyramid_net.ShakeDrop(*args, **kwargs)

Bases: tensorflow.python.keras.engine.base_layer.Layer

Shake drop layer.

Most of the code follows the implementation from tensorflow research [1].

References

[1] https://github.com/tensorflow/models/blob/master/research/autoaugment/shake_drop.py

Methods

call(inputs[, is_training])

This is where the layer's logic lives.

get_config()

Returns the config of the layer.

call(inputs, is_training=True)

This is where the layer’s logic lives.

Note that the call() method in tf.keras is slightly different from the Keras API: in the Keras API you can pass masking support for layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args:

inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns:

A tensor or list/tuple of tensors.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

Utils

omnizart.models.utils.shape_list(input_tensor)

Return list of dims, statically where possible.
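
A small illustration; the printed output assumes a fully static shape:

    import tensorflow as tf
    from omnizart.models.utils import shape_list

    x = tf.zeros([2, 3, 4])
    print(shape_list(x))  # [2, 3, 4] -- plain Python ints where statically known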