cadl package

Submodules

cadl.batch_norm module

Batch Normalization for TensorFlow.

cadl.batch_norm.batch_norm(x, phase_train, name='bn', decay=0.9, reuse=None, affine=True)[source]

Batch normalization on convolutional maps. From: https://stackoverflow.com/questions/33949786/how-could-i-use-batch-normalization-in-tensorflow. Only modified to infer shape from input tensor x.

[DEPRECATED] Use tflearn or slim batch normalization instead.

Parameters:
  • x – Tensor, 4D BHWD input maps
  • phase_train – boolean tf.Variable, true indicates training phase
  • name – string, variable name
  • decay (float, optional) – Description
  • reuse (None, optional) – Description
  • affine – whether to affine-transform outputs
Returns:

batch-normalized maps

Return type:

normed
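
A minimal usage sketch, assuming TensorFlow 1.x; the convolution producing h is illustrative only:

    import tensorflow as tf
    from cadl.batch_norm import batch_norm

    x = tf.placeholder(tf.float32, [None, 32, 32, 3], name='x')
    phase_train = tf.placeholder(tf.bool, name='phase_train')  # True during training
    h = tf.layers.conv2d(x, filters=16, kernel_size=3, padding='same', name='conv1')
    h_bn = batch_norm(h, phase_train, name='bn1')  # batch-normalized maps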

cadl.celeb_vaegan module

Tools for downloading the celeb dataset and model, including preprocessing.

cadl.celeb_vaegan.celeb_vaegan_download()[source]

Download a pretrained celeb vae/gan network.

Returns:Description
Return type:TYPE
cadl.celeb_vaegan.get_celeb_vaegan_model()[source]

Get a pretrained model.

Returns:
  net – a dict with the keys:
  • ‘graph_def’ (tf.GraphDef) – The graph definition
  • ‘labels’ (list) – List of different possible attributes from celeb
  • ‘attributes’ (np.ndarray) – One hot encoding of the attributes per image [n_els x n_labels]
  • ‘preprocess’ (function) – Preprocess function

Return type:dict
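
A hedged sketch of loading the returned dict and importing its graph definition, assuming TensorFlow 1.x and the dict keys described above:

    import tensorflow as tf
    from cadl import celeb_vaegan

    net = celeb_vaegan.get_celeb_vaegan_model()
    g = tf.Graph()
    with g.as_default():
        tf.import_graph_def(net['graph_def'], name='net')
    print(len(net['labels']), 'attributes;', net['attributes'].shape)
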
cadl.celeb_vaegan.preprocess(img, crop_factor=0.8)[source]

Replicate the preprocessing we did on the VAE/GAN.

This model used a crop_factor of 0.8 and crop size of [100, 100, 3].

Parameters:
  • img (TYPE) – Description
  • crop_factor (float, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.charrnn module

Character-level Recurrent Neural Network.

cadl.charrnn.build_model(txt, batch_size=1, sequence_length=1, n_layers=2, n_cells=100, gradient_clip=10.0, learning_rate=0.001)[source]

Summary

Parameters:
  • txt (TYPE) – Description
  • batch_size (int, optional) – Description
  • sequence_length (int, optional) – Description
  • n_layers (int, optional) – Description
  • n_cells (int, optional) – Description
  • gradient_clip (float, optional) – Description
  • learning_rate (float, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.charrnn.infer(txt, ckpt_name, n_iterations, n_cells=200, n_layers=3, learning_rate=0.001, max_iter=5000, gradient_clip=10.0, init_value=[0], keep_prob=1.0, sampling='prob', temperature=1.0)[source]
Parameters:
  • txt (TYPE) – Description
  • ckpt_name (TYPE) – Description
  • n_iterations (TYPE) – Description
  • n_cells (int, optional) – Description
  • n_layers (int, optional) – Description
  • learning_rate (float, optional) – Description
  • max_iter (int, optional) – Description
  • gradient_clip (float, optional) – Description
  • init_value (list, optional) – Description
  • keep_prob (float, optional) – Description
  • sampling (str, optional) – Description
  • temperature (float, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.charrnn.test_alice(max_iter=5)[source]

Summary

Parameters:max_iter (int, optional) – Description
Returns:Description
Return type:TYPE
cadl.charrnn.test_trump(max_iter=100)[source]

Summary

Parameters:max_iter (int, optional) – Description
cadl.charrnn.test_wtc()[source]

Summary

cadl.charrnn.train(txt, batch_size=100, sequence_length=150, n_cells=200, n_layers=3, learning_rate=1e-05, max_iter=50000, gradient_clip=5.0, ckpt_name='model.ckpt', keep_prob=1.0)[source]
Parameters:
  • txt (TYPE) – Description
  • batch_size (int, optional) – Description
  • sequence_length (int, optional) – Description
  • n_cells (int, optional) – Description
  • n_layers (int, optional) – Description
  • learning_rate (float, optional) – Description
  • max_iter (int, optional) – Description
  • gradient_clip (float, optional) – Description
  • ckpt_name (str, optional) – Description
  • keep_prob (float, optional) – Description
Returns:

Description

Return type:

TYPE
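
A hedged usage sketch of training a character RNN on a text corpus and then sampling from the saved checkpoint; the corpus file name and hyperparameters are illustrative only:

    from cadl import charrnn

    with open('input.txt') as f:  # hypothetical corpus
        txt = f.read()

    charrnn.train(txt, batch_size=100, sequence_length=150,
                  n_cells=200, n_layers=3, ckpt_name='model.ckpt')
    charrnn.infer(txt, ckpt_name='model.ckpt', n_iterations=500,
                  n_cells=200, n_layers=3, sampling='prob', temperature=1.0)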

cadl.cornell module

Tools for downloading and preprocessing the Cornell Movie DB.

cadl.cornell.download_cornell(dst='cornell movie-dialogs corpus')[source]

Summary

Parameters:dst (str, optional) – Description
cadl.cornell.get_characters(path='cornell movie-dialogs corpus')[source]
  • movie_characters_metadata.txt
    • contains information about each movie character
    • fields:
      • characterID
      • character name
      • movieID
      • movie title
      • gender (”?” for unlabeled cases)
      • position in credits (”?” for unlabeled cases)
Parameters:path (TYPE, optional) – Description
Returns:Description
Return type:TYPE
cadl.cornell.get_conversations(path='cornell movie-dialogs corpus')[source]
  • movie_conversations.txt
    • the structure of the conversations
    • fields
      • characterID of the first character involved in the conversation
      • characterID of the second character involved in the conversation
      • movieID of the movie in which the conversation occurred
      • list of the utterances that make the conversation, in
        chronological order: [‘lineID1’,’lineID2’,…,’lineIDN’]; has to be matched with movie_lines.txt to reconstruct the actual content
Parameters:path (TYPE, optional) – Description
Returns:Description
Return type:TYPE
cadl.cornell.get_lines(path='cornell movie-dialogs corpus')[source]
  • movie_lines.txt
    • contains the actual text of each utterance
    • fields:
      • lineID
      • characterID (who uttered this phrase)
      • movieID
      • character name
      • text of the utterance
Parameters:path (TYPE, optional) – Description
Returns:Description
Return type:TYPE
cadl.cornell.get_scripts(path='cornell movie-dialogs corpus')[source]

Summary

Parameters:path (TYPE, optional) – Description
Returns:Description
Return type:TYPE
cadl.cornell.get_titles(path='cornell movie-dialogs corpus')[source]
  • movie_titles_metadata.txt
    • contains information about each movie title
    • fields:
      • movieID,
      • movie title,
      • movie year,
      • IMDB rating,
      • no. IMDB votes,
      • genres in the format [‘genre1’,’genre2’,…,’genreN’]
Parameters:path (TYPE, optional) – Description
Returns:Description
Return type:TYPE
cadl.cornell.id2word(ids, vocab)[source]

Summary

Parameters:
  • ids (TYPE) – Description
  • vocab (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.cornell.preprocess(text, min_count=10, max_length=40)[source]

Summary

Parameters:
  • text (TYPE) – Description
  • min_count (int, optional) – Description
  • max_length (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.cornell.test_decode(sentence)[source]

Test decoding of cornell dataset with deprecated seq2seq model.

Parameters:sentence (TYPE) – Description
cadl.cornell.test_train()[source]

Test training of cornell dataset with deprecated bucketed seq2seq model.

cadl.cornell.word2id(words, vocab, UNK_ID=3)[source]

Summary

Parameters:
  • words (TYPE) – Description
  • vocab (TYPE) – Description
  • UNK_ID (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan module

Cycle Generative Adversarial Network for Unpaired Image to Image translation.

cadl.cycle_gan.batch_generator_dataset(imgs1, imgs2)[source]

Summary

Parameters:
  • imgs1 (TYPE) – Description
  • imgs2 (TYPE) – Description
Yields:

TYPE – Description

cadl.cycle_gan.batch_generator_random_crop(X, Y, min_size=256, max_size=512, n_images=100)[source]

Summary

Parameters:
  • X (TYPE) – Description
  • Y (TYPE) – Description
  • min_size (int, optional) – Description
  • max_size (int, optional) – Description
  • n_images (int, optional) – Description
Yields:

TYPE – Description

cadl.cycle_gan.conv2d(inputs, activation_fn=<function lrelu>, normalizer_fn=<function instance_norm>, scope='conv2d', **kwargs)[source]

Summary

Parameters:
  • inputs (TYPE) – Description
  • activation_fn (TYPE, optional) – Description
  • normalizer_fn (TYPE, optional) – Description
  • scope (str, optional) – Description
  • **kwargs – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.conv2d_transpose(inputs, activation_fn=<function lrelu>, normalizer_fn=<function instance_norm>, scope='conv2d_transpose', **kwargs)[source]

Summary

Parameters:
  • inputs (TYPE) – Description
  • activation_fn (TYPE, optional) – Description
  • normalizer_fn (TYPE, optional) – Description
  • scope (str, optional) – Description
  • **kwargs – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.cycle_gan(img_size=256)[source]

Summary

Parameters:img_size (int, optional) – Description
Returns:Description
Return type:TYPE
cadl.cycle_gan.decoder(x, n_filters=32, k_size=3, scope=None)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • n_filters (int, optional) – Description
  • k_size (int, optional) – Description
  • scope (None, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.discriminator(x, n_filters=64, k_size=4, scope=None, reuse=None)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • n_filters (int, optional) – Description
  • k_size (int, optional) – Description
  • scope (None, optional) – Description
  • reuse (None, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.encoder(x, n_filters=32, k_size=3, scope=None)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • n_filters (int, optional) – Description
  • k_size (int, optional) – Description
  • scope (None, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.generator(x, scope=None, reuse=None)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • scope (None, optional) – Description
  • reuse (None, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.get_images(path1, path2, img_size=256)[source]

Summary

Parameters:
  • path1 (TYPE) – Description
  • path2 (TYPE) – Description
  • img_size (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.instance_norm(x, epsilon=1e-05)[source]

Instance Normalization.

See Ulyanov, D., Vedaldi, A., & Lempitsky, V. (2016). Instance Normalization: The Missing Ingredient for Fast Stylization, Retrieved from http://arxiv.org/abs/1607.08022

Parameters:
  • x (TYPE) – Description
  • epsilon (float, optional) – Description
Returns:

Description

Return type:

TYPE
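
A hedged numpy sketch of the normalization step described in Ulyanov et al. (2016): each sample and channel is normalized over its spatial dimensions (the TensorFlow version above may additionally apply a learned scale and shift):

    import numpy as np

    def instance_norm_np(x, epsilon=1e-5):
        # x: [batch, height, width, channels]
        mean = x.mean(axis=(1, 2), keepdims=True)
        var = x.var(axis=(1, 2), keepdims=True)
        return (x - mean) / np.sqrt(var + epsilon)

    x = np.random.randn(2, 8, 8, 3).astype(np.float32)
    print(instance_norm_np(x).shape)  # (2, 8, 8, 3)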

cadl.cycle_gan.l1loss(x, y)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • y (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.l2loss(x, y)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • y (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.lrelu(x, leak=0.2, name='lrelu')[source]

Summary

Parameters:
  • x (TYPE) – Description
  • leak (float, optional) – Description
  • name (str, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.residual_block(x, n_channels=128, kernel_size=3, scope=None)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • n_channels (int, optional) – Description
  • kernel_size (int, optional) – Description
  • scope (None, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.train(ds_X, ds_Y, ckpt_path='cycle_gan', learning_rate=0.0002, n_epochs=100, img_size=256)[source]

Summary

Parameters:
  • ds_X (TYPE) – Description
  • ds_Y (TYPE) – Description
  • ckpt_path (str, optional) – Description
  • learning_rate (float, optional) – Description
  • n_epochs (int, optional) – Description
  • img_size (int, optional) – Description
cadl.cycle_gan.transform(x, img_size=256)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • img_size (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.dataset_utils module

Utils for creating datasets.

class cadl.dataset_utils.Dataset(Xs, ys=None, split=[1.0, 0.0, 0.0], one_hot=False, n_classes=1)[source]

Bases: object

Create a dataset from data and their labels.

Allows easy use of train/valid/test splits; Batch generator.

all_idxs

list – All indexes across all splits.

all_inputs

list – All inputs across all splits.

all_labels

list – All labels across all splits.

n_classes

int – Number of labels.

split

list – Percentage split of train, valid, test sets.

test_idxs

list – Indexes of the test split.

train_idxs

list – Indexes of the train split.

valid_idxs

list – Indexes of the valid split.

X

Inputs/Xs/Images.

Returns:all_inputs – Original Inputs/Xs.
Return type:np.ndarray
Y

Outputs/ys/Labels.

Returns:all_labels – Original Outputs/ys.
Return type:np.ndarray
mean()[source]

Mean of the inputs/Xs.

Returns:mean – Calculates mean across 0th (batch) dimension.
Return type:np.ndarray
std()[source]

Standard deviation of the inputs/Xs.

Returns:std – Calculates std across 0th (batch) dimension.
Return type:np.ndarray
test

Test split.

Returns:split – Split of the test dataset.
Return type:DatasetSplit
train

Train split.

Returns:split – Split of the train dataset.
Return type:DatasetSplit
valid

Validation split.

Returns:split – Split of the validation dataset.
Return type:DatasetSplit
class cadl.dataset_utils.DatasetSplit(images, labels)[source]

Bases: object

Utility class for batching data and handling multiple splits.

current_batch_idx

int – Description

images

np.ndarray – Xs of the dataset. Not necessarily images.

labels

np.ndarray – ys of the dataset.

n_classes

int – Number of possible labels

num_examples

int – Number of total observations

next_batch(batch_size=100)[source]

Batch generator with randomization.

Parameters:batch_size (int, optional) – Size of each minibatch.
Yields:Xs, ys (np.ndarray, np.ndarray) – Next batch of inputs and labels (if no labels, then None).
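
A hedged sketch of wrapping arrays in a Dataset and iterating minibatches from its train split; the array shapes are illustrative only:

    import numpy as np
    from cadl.dataset_utils import Dataset

    Xs = np.random.rand(1000, 32, 32, 3).astype(np.float32)
    ys = np.random.randint(0, 10, size=1000)
    ds = Dataset(Xs, ys, split=[0.8, 0.1, 0.1], one_hot=True, n_classes=10)

    for X_batch, y_batch in ds.train.next_batch(batch_size=100):
        print(X_batch.shape, y_batch.shape)
        break
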
cadl.dataset_utils.cifar10_download(dst='cifar10')[source]

Download the CIFAR10 dataset.

Parameters:dst (str, optional) – Directory to download into.
cadl.dataset_utils.cifar10_load(dst='cifar10')[source]

Load the CIFAR10 dataset.

Downloads the dataset if it does not exist into the dst directory.

Parameters:dst (str, optional) – Location of CIFAR10 dataset.
Returns:Xs, ys – Array of data, Array of labels
Return type:np.ndarray, np.ndarray
cadl.dataset_utils.create_input_pipeline(files, batch_size, n_epochs, shape, crop_shape=None, crop_factor=1.0, n_threads=2)[source]

Creates a pipeline from a list of image files. Includes batch generator/central crop/resizing options. The resulting generator will dequeue the images batch_size at a time until it throws tf.errors.OutOfRangeError when there are no more images left in the queue.

Parameters:
  • files (list) – List of paths to image files.
  • batch_size (int) – Number of image files to load at a time.
  • n_epochs (int) – Number of epochs to run before raising tf.errors.OutOfRangeError
  • shape (list) – [height, width, channels]
  • crop_shape (list) – [height, width] to crop image to.
  • crop_factor (float) – Percentage of image to take starting from center.
  • n_threads (int, optional) – Number of threads to use for batch shuffling
Returns:

Description

Return type:

TYPE

cadl.dataset_utils.dense_to_one_hot(labels, n_classes=2)[source]

Convert class labels from scalars to one-hot vectors.

Parameters:
  • labels (array) – Input labels to convert to one-hot representation.
  • n_classes (int, optional) – Number of possible classes in the one-hot representation.
Returns:

one_hot – One hot representation of input.

Return type:

array
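
A minimal usage sketch; the expected output noted in the comment is an assumption based on the description above:

    import numpy as np
    from cadl.dataset_utils import dense_to_one_hot

    labels = np.array([0, 2, 1])
    one_hot = dense_to_one_hot(labels, n_classes=3)
    print(one_hot)  # expected shape (3, 3), e.g. rows [1, 0, 0], [0, 0, 1], [0, 1, 0]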

cadl.dataset_utils.gtzan_music_speech_download(dst='gtzan_music_speech')[source]

Download the GTZAN music and speech dataset.

Parameters:dst (str, optional) – Location to put the GTZAN music and speech dataset.
cadl.dataset_utils.gtzan_music_speech_load(dst='gtzan_music_speech')[source]

Load the GTZAN Music and Speech dataset.

Downloads the dataset if it does not exist into the dst directory.

Parameters:dst (str, optional) – Location of GTZAN Music and Speech dataset.
Returns:Xs, ys – Array of data, Array of labels
Return type:np.ndarray, np.ndarray
cadl.dataset_utils.tiny_imagenet_download(dst='tiny_imagenet')[source]

Download the Tiny ImageNet dataset.

Parameters:dst (str, optional) – Directory to download into.
cadl.dataset_utils.tiny_imagenet_load(dst='tiny_imagenet')[source]

Loads the paths to every file in the Tiny ImageNet dataset.

Downloads the dataset if it does not exist into the dst directory.

Parameters:dst (str, optional) – Location of Tiny ImageNet dataset.
Returns:all_files – List of paths to every file in the Tiny ImageNet Dataset
Return type:list

cadl.datasets module

Utils for loading common datasets.

cadl.datasets.CELEB(path='./img_align_celeba/')[source]

Attempt to load the files of the CELEB dataset.

Requires the files to already be downloaded and placed in the dst directory. The first 100 files can be downloaded with the cadl.utils function get_celeb_files.

http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html

Parameters:path (str, optional) – Directory where the aligned/cropped celeb dataset can be found.
Returns:files – List of file paths to the dataset.
Return type:list
cadl.datasets.CIFAR10(flatten=True, split=[1.0, 0.0, 0.0])[source]

Returns the CIFAR10 dataset.

Parameters:
  • flatten (bool, optional) – Convert the 3 x 32 x 32 pixels to a single vector
  • split (list, optional) – Description
Returns:

cifar – Description

Return type:

Dataset

cadl.datasets.GTZAN(path='./gtzan_music_speech')[source]

Load the GTZAN Music and Speech dataset.

Downloads the dataset if it does not exist into the dst directory.

Parameters:path (str, optional) – Description
Returns:
  • ds (Dataset) – Dataset object with array of data in X and array of labels in Y
Deleted Parameters:dst (str, optional) – Location of GTZAN Music and Speech dataset.
cadl.datasets.MNIST(one_hot=True, split=[1.0, 0.0, 0.0])[source]

Returns the MNIST dataset.

Returns:

mnist – DataSet object w/ convenience props for accessing train/validation/test sets and batches.

Return type:

DataSet

Parameters:
  • one_hot (bool, optional) – Description
  • split (list, optional) – Description
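
A hedged usage sketch, assuming the returned object exposes the same train/next_batch interface as cadl.dataset_utils.Dataset:

    from cadl import datasets

    ds = datasets.MNIST(one_hot=True, split=[0.8, 0.1, 0.1])
    for X_batch, y_batch in ds.train.next_batch(batch_size=50):
        print(X_batch.shape, y_batch.shape)
        break
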
cadl.datasets.TINYIMAGENET(path='./tiny_imagenet/')[source]

Attempt to load the files of the Tiny ImageNet dataset.

http://cs231n.stanford.edu/tiny-imagenet-200.zip https://tiny-imagenet.herokuapp.com/

Parameters:path (str, optional) – Directory where the dataset can be found or else will be placed.
Returns:
  • files (list) – List of file paths to the dataset.
  • labels (list) – List of labels for each file (only training files have labels)

cadl.deepdream module

Deep Dream using the Inception v5 network.

cadl.deepdream.deep_dream(input_img, downsize=False, model='inception', layer_i=-1, neuron_i=-1, n_iterations=100, save_gif=None, save_images='imgs', device='/cpu:0', **kwargs)[source]

Deep Dream with the given parameters.

Parameters:
  • input_img (np.ndarray) – Image to apply deep dream to. Should be 3-dimensional H x W x C RGB uint8 or float32.
  • downsize (bool, optional) – Whether or not to downsize the image. Only applies to model==’inception’.
  • model (str, optional) – Which model to load. Must be one of: [‘inception’], ‘i2v_tag’, ‘i2v’, ‘vgg16’, or ‘vgg_face’.
  • layer_i (int, optional) – Which layer to use for finding the gradient. E.g. the softmax layer for inception is -1, for vgg networks it is -2. Use the function “get_layer_names” to find the layer number that you need.
  • neuron_i (int, optional) – Which neuron to use. -1 for the entire layer.
  • n_iterations (int, optional) – Number of iterations to dream.
  • save_gif (bool, optional) – Save a GIF.
  • save_images (str, optional) – Folder to save images to.
  • device (str, optional) – Which device to use, e.g. [‘/cpu:0’] or ‘/gpu:0’.
  • **kwargs (dict) – See “_apply” for additional parameters.
Returns:

imgs – Images of every iteration

Return type:

list of np.array
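
A hedged usage sketch running Deep Dream on a single RGB image with the inception model; the choice of input image is illustrative only:

    from cadl import deepdream
    from cadl.utils import get_celeb_imgs  # any H x W x C uint8 image works

    img = get_celeb_imgs(max_images=1)[0]
    imgs = deepdream.deep_dream(img, model='inception', layer_i=-1, neuron_i=-1,
                                n_iterations=50, save_images='imgs', device='/cpu:0')
    print(len(imgs), 'frames')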

cadl.deepdream.get_labels(model='inception')[source]

Return labels corresponding to the neuron_i parameter of deep dream.

Parameters:model (str, optional) – Which model to load. Must be one of: [‘inception’], ‘i2v_tag’, ‘i2v’, ‘vgg16’, or ‘vgg_face’.
Raises:ValueError – Unknown model. Must be one of: [‘inception’], ‘i2v_tag’, ‘i2v’, ‘vgg16’, or ‘vgg_face’.
Returns:Description
Return type:TYPE
cadl.deepdream.get_layer_names(model='inception')[source]

Return every layer’s index and name in the given model.

Parameters:model (str, optional) – Which model to load. Must be one of: [‘inception’], ‘i2v_tag’, ‘i2v’, ‘vgg16’, or ‘vgg_face’.
Returns:names – The index and layer’s name for every layer in the given model.
Return type:list of tuples
cadl.deepdream.guided_dream(input_img, guide_img=None, downsize=False, layers=[162, 183, 184, 247], label_i=962, layer_i=-1, feature_loss_weight=1.0, tv_loss_weight=1.0, l2_loss_weight=1.0, softmax_loss_weight=1.0, model='inception', neuron_i=920, n_iterations=100, save_gif=None, save_images='imgs', device='/cpu:0', **kwargs)[source]

Deep Dream v2. Use an optional guide image and other techniques.

Parameters:
  • input_img (np.ndarray) – Image to apply deep dream to. Should be 3-dimensional H x W x C RGB uint8 or float32.
  • guide_img (np.ndarray, optional) – Optional image to find features at different layers for. Must pass in a list of layers that you want to find features for. Then the guided dream will try to match this images features at those layers.
  • downsize (bool, optional) – Whether or not to downsize the image. Only applies to model==’inception’.
  • layers (list, optional) – A list of layers to find features for in the “guide_img”.
  • label_i (int, optional) – Which label to use for the softmax layer. Use the “get_labels” function to find the index corresponding the object of interest. If None, not used.
  • layer_i (int, optional) – Which layer to use for finding the gradient. E.g. the softmax layer for inception is -1, for vgg networks it is -2. Use the function “get_layer_names” to find the layer number that you need.
  • feature_loss_weight (float, optional) – Weighting for the feature loss from the guide_img.
  • tv_loss_weight (float, optional) – Total variational loss weighting. Enforces smoothness.
  • l2_loss_weight (float, optional) – L2 loss weighting. Enforces smaller values and reduces saturation.
  • softmax_loss_weight (float, optional) – Softmax loss weighting. Must set label_i.
  • model (str, optional) – Which model to load. Must be one of: [‘inception’], ‘i2v_tag’, ‘i2v’, ‘vgg16’, or ‘vgg_face’.
  • neuron_i (int, optional) – Which neuron to use. -1 for the entire layer.
  • n_iterations (int, optional) – Number of iterations to dream.
  • save_gif (bool, optional) – Save a GIF.
  • save_images (str, optional) – Folder to save images to.
  • device (str, optional) – Which device to use, e.g. [‘/cpu:0’] or ‘/gpu:0’.
  • **kwargs (dict) – See “_apply” for additional parameters.
Returns:

imgs – Images of the dream.

Return type:

list of np.ndarray

cadl.dft module

Utils for performing a DFT using numpy.

cadl.dft.ctoz(mag, phs)[source]

Summary

Parameters:
  • mag (TYPE) – Description
  • phs (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.dft.dft_np(signal, hop_size=256, fft_size=512)[source]

Summary

Parameters:
  • signal (TYPE) – Description
  • hop_size (int, optional) – Description
  • fft_size (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.dft.idft_np(re, im, hop_size=256, fft_size=512)[source]

Summary

Parameters:
  • re (TYPE) – Description
  • im (TYPE) – Description
  • hop_size (int, optional) – Description
  • fft_size (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.dft.ztoc(re, im)[source]

Summary

Parameters:
  • re (TYPE) – Description
  • im (TYPE) – Description
Returns:

Description

Return type:

TYPE
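
An independent numpy illustration of the hop-based DFT idea behind this module, not the cadl.dft implementation itself: slide a window of fft_size samples over the signal every hop_size samples and take the FFT of each frame:

    import numpy as np

    def dft_frames(signal, hop_size=256, fft_size=512):
        n_frames = (len(signal) - fft_size) // hop_size
        frames = np.stack([signal[i * hop_size:i * hop_size + fft_size]
                           for i in range(n_frames)])
        return np.fft.rfft(frames, axis=-1)  # complex spectra, one row per frame

    z = dft_frames(np.random.randn(4096))
    mag, phs = np.abs(z), np.angle(z)  # polar form, cf. ctoz/ztoc
    print(z.shape)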

cadl.draw module

Deep Recurrent Attentive Writer.

cadl.draw.binary_cross_entropy(t, o, eps=1e-10)[source]

Summary

Parameters:
  • t (TYPE) – Description
  • o (TYPE) – Description
  • eps (float, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.draw.create_attention_map(h_dec, reuse=None)[source]

Summary

Parameters:
  • h_dec (TYPE) – Description
  • reuse (None, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.draw.create_filterbank(g_x, g_y, log_sigma_sq, log_delta, A=28, B=28, C=1, N=12)[source]

Summary

Parameters:
  • g_x (TYPE) – Description
  • g_y (TYPE) – Description
  • log_sigma_sq (TYPE) – Description
  • log_delta (TYPE) – Description
  • A (int, optional) – Description
  • B (int, optional) – Description
  • C (int, optional) – Description
  • N (int, optional) – Description
Returns:

  • name (TYPE) – Description
Deleted Parameters:log_sigma (type) – Description

cadl.draw.create_model(A=28, B=28, C=1, T=16, batch_size=100, n_enc=128, n_z=32, n_dec=128, read_n=12, write_n=12)[source]

Summary

cadl.draw.decoder(z, rnn, batch_size, state=None, n_dec=64, reuse=None)[source]

Summary

Parameters:
  • z (TYPE) – Description
  • rnn (TYPE) – Description
  • batch_size (TYPE) – Description
  • state (None, optional) – Description
  • n_dec (int, optional) – Description
  • reuse (None, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.draw.encoder(x, rnn, batch_size, state=None, n_enc=64, reuse=None)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • rnn (TYPE) – Description
  • batch_size (TYPE) – Description
  • state (None, optional) – Description
  • n_enc (int, optional) – Description
  • reuse (None, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.draw.filter_image(x, F_x, F_y, log_gamma, A, B, C, N, inverted=False)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • F_x (TYPE) – Description
  • F_y (TYPE) – Description
  • log_gamma (TYPE) – Description
  • A (TYPE) – Description
  • B (TYPE) – Description
  • C (TYPE) – Description
  • N (TYPE) – Description
  • inverted (bool, optional) – Description
Returns:

  • name (TYPE) – Description
Deleted Parameters:gamma (TYPE) – Description

cadl.draw.linear(x, n_output)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • n_output (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.draw.read(x_t, x_hat_t, h_dec_t, read_n=5, A=28, B=28, C=1, use_attention=True, reuse=None)[source]

Read from the input image, x, and reconstruction error image x_hat.

Optionally apply a filterbank w/ use_attention.

Parameters:
  • x_t (tf.Tensor) – Input image to optionally filter
  • x_hat_t (tf.Tensor) – Reconstruction error to optionally filter
  • h_dec_t (tf.Tensor) – Output of the decoder of the network (could also be the encoder, but the authors suggest using the decoder instead; see end of section 2.1)
  • read_n (int, optional) – Description
  • A (int, optional) – Description
  • B (int, optional) – Description
  • C (int, optional) – Description
  • use_attention (bool, optional) – Description
  • reuse (None, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.draw.test_mnist()[source]
cadl.draw.train_dataset(ds, A, B, C, T=20, n_enc=512, n_z=200, n_dec=512, read_n=12, write_n=12, batch_size=100, n_epochs=100)[source]
cadl.draw.train_input_pipeline(files, A, B, C, T=20, n_enc=512, n_z=256, n_dec=512, read_n=15, write_n=15, batch_size=64, n_epochs=1000000000.0, input_shape=(64, 64, 3))[source]
cadl.draw.variational_layer(h_enc, noise, n_z=2, reuse=None)[source]

Summary

Parameters:
  • h_enc (TYPE) – Description
  • noise (TYPE) – Description
  • n_z (int, optional) – Description
  • reuse (None, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.draw.write(h_dec_t, write_n=5, A=28, B=28, C=1, use_attention=True, reuse=None)[source]

Summary

Parameters:
  • h_dec_t (TYPE) – Description
  • write_n (int, optional) – Description
  • A (int, optional) – Description
  • B (int, optional) – Description
  • C (int, optional) – Description
  • use_attention (bool, optional) – Description
  • reuse (None, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.fastwavenet module

WaveNet Training and Fast WaveNet Decoding.

From the following paper

Ramachandran, P., Le Paine, T., Khorrami, P., Babaeizadeh, M., Chang, S., Zhang, Y., … Huang, T. (2017). Fast Generation For Convolutional Autoregressive Models, 1–5.

cadl.fastwavenet.create_generation_model(n_stages=5, n_layers_per_stage=10, n_hidden=256, batch_size=1, n_skip=128, n_quantization=256, filter_length=2, onehot=False)[source]

Summary

Parameters:
  • n_stages (int, optional) – Description
  • n_layers_per_stage (int, optional) – Description
  • n_hidden (int, optional) – Description
  • batch_size (int, optional) – Description
  • n_skip (int, optional) – Description
  • n_quantization (int, optional) – Description
  • filter_length (int, optional) – Description
  • onehot (bool, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.fastwavenet.get_sequence_length(n_stages, n_layers_per_stage)[source]

Summary

Parameters:
  • n_stages (TYPE) – Description
  • n_layers_per_stage (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.fastwavenet.test_librispeech()[source]

Summary

cadl.gan module

Generative Adversarial Network.

cadl.gan.GAN(input_shape, n_latent, n_features, rgb, debug=True)[source]

Summary

Parameters:
  • input_shape (TYPE) – Description
  • n_latent (TYPE) – Description
  • n_features (TYPE) – Description
  • rgb (TYPE) – Description
  • debug (bool, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.gan.decoder(z, dimensions=[], channels=[], filter_sizes=[], convolutional=False, activation=<function relu>, output_activation=<function tanh>, reuse=None)[source]

Decoder network codes input x to layers defined by dimensions.

In contrast with encoder, this requires information on the number of output channels in each layer for convolution. Otherwise, it is mostly the same.

Parameters:
  • z (tf.Tensor) – Input to the decoder network, e.g. tf.Placeholder or tf.Variable
  • dimensions (list, optional) – List of the number of neurons in each layer (convolutional=False) -or- List of the number of filters in each layer (convolutional=True), e.g. [100, 100, 100, 100] for a 4-layer deep network with 100 in each layer.
  • channels (list, optional) – For decoding when convolutional=True, require the number of output channels in each layer.
  • filter_sizes (list, optional) – List of the size of the kernel in each layer, e.g.: [3, 3, 3, 3] is a 4-layer deep network w/ 3 x 3 kernels in every layer.
  • convolutional (bool, optional) – Whether or not to use convolutional layers.
  • activation (fn, optional) – Function for applying an activation, e.g. tf.nn.relu
  • output_activation (fn, optional) – Function for applying an activation on the last layer, e.g. tf.nn.relu
  • reuse (bool, optional) – For each layer’s variable scope, whether to reuse existing variables.
Returns:

h – Output tensor of the decoder

Return type:

tf.Tensor

cadl.gan.discriminator(x, convolutional=True, n_features=32, rgb=False, reuse=False)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • convolutional (bool, optional) – Description
  • n_features (int, optional) – Description
  • rgb (bool, optional) – Description
  • reuse (bool, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.gan.encoder(x, dimensions=[], filter_sizes=[], convolutional=False, activation=<function relu>, output_activation=<function sigmoid>, reuse=False)[source]

Encoder network codes input x to layers defined by dimensions.

Parameters:
  • x (tf.Tensor) – Input to the encoder network, e.g. tf.Placeholder or tf.Variable
  • dimensions (list, optional) – List of the number of neurons in each layer (convolutional=False) -or- List of the number of filters in each layer (convolutional=True), e.g. [100, 100, 100, 100] for a 4-layer deep network with 100 in each layer.
  • filter_sizes (list, optional) – List of the size of the kernel in each layer, e.g.: [3, 3, 3, 3] is a 4-layer deep network w/ 3 x 3 kernels in every layer.
  • convolutional (bool, optional) – Whether or not to use convolutional layers.
  • activation (fn, optional) – Function for applying an activation, e.g. tf.nn.relu
  • output_activation (fn, optional) – Function for applying an activation on the last layer, e.g. tf.nn.relu
  • reuse (bool, optional) – For each layer’s variable scope, whether to reuse existing variables.
Returns:

h – Output tensor of the encoder

Return type:

tf.Tensor

cadl.gan.generator(z, output_h, output_w, convolutional=True, n_features=32, rgb=False, reuse=None)[source]

Simple interface to build a decoder network given the input parameters.

Parameters:
  • z (tf.Tensor) – Input to the generator, i.e. tf.Placeholder of tf.Variable
  • output_h (int) – Final generated height
  • output_w (int) – Final generated width
  • convolutional (bool, optional) – Whether or not to build a convolutional generative network.
  • n_features (int, optional) – Number of channels to use in the last hidden layer.
  • rgb (bool, optional) – Whether or not the final generated image is RGB or not.
  • reuse (None, optional) – Whether or not to reuse the variables if they are already created.
Returns:

x_tilde – Output of the generator network.

Return type:

tf.Tensor

cadl.gan.train_input_pipeline(files, init_lr_g=0.0001, init_lr_d=0.0001, n_features=10, n_latent=100, n_epochs=1000000, batch_size=200, n_samples=15, input_shape=[218, 178, 3], crop_shape=[64, 64, 3], crop_factor=0.8)[source]

Summary

Parameters:
  • files (TYPE) – Description
  • init_lr_g (float, optional) – Description
  • init_lr_d (float, optional) – Description
  • n_features (int, optional) – Description
  • n_latent (int, optional) – Description
  • n_epochs (int, optional) – Description
  • batch_size (int, optional) – Description
  • n_samples (int, optional) – Description
  • input_shape (list, optional) – Description
  • crop_shape (list, optional) – Description
  • crop_factor (float, optional) – Description
No Longer Returned:name (TYPE) – Description

cadl.gif module

Utility for creating a GIF.

cadl.gif.build_gif(imgs, interval=0.1, dpi=72, save_gif=True, saveto='animation.gif', show_gif=False, cmap=None)[source]

Take an array or list of images and create a GIF.

Parameters:
  • imgs (np.ndarray or list) – List of images to create a GIF of
  • interval (float, optional) – Spacing in seconds between successive images.
  • dpi (int, optional) – Dots per inch.
  • save_gif (bool, optional) – Whether or not to save the GIF.
  • saveto (str, optional) – Filename of GIF to save.
  • show_gif (bool, optional) – Whether or not to render the GIF using plt.
  • cmap (None, optional) – Optional colormap to apply to the images.
Returns:

ani – The artist animation from matplotlib. Likely not useful.

Return type:

matplotlib.animation.ArtistAnimation
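
A minimal usage sketch; the random frames are illustrative only:

    import numpy as np
    from cadl.gif import build_gif

    imgs = [np.random.rand(64, 64, 3) for _ in range(10)]
    build_gif(imgs, interval=0.2, saveto='animation.gif', show_gif=False)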

cadl.glove module

Global Vector Embeddings.

cadl.glove.course_example()[source]

Summary

cadl.glove.get_model()[source]

Summary

Returns:Description
Return type:TYPE

cadl.i2v module

Illustration2Vec model and preprocessing.

cadl.i2v.deprocess(img)[source]

Summary

Parameters:img (TYPE) – Description
Returns:Description
Return type:TYPE
cadl.i2v.get_i2v_model()[source]

Get a pretrained i2v network.

Returns:net – {‘graph_def’: graph_def, ‘labels’: synsets} where the graph_def is a tf.GraphDef and the synsets map an integer label from 0-1000 to a list of names
Return type:dict
cadl.i2v.get_i2v_tag_model()[source]

Get a pretrained i2v tag network.

Returns:net – {‘graph_def’: graph_def, ‘labels’: synsets} where the graph_def is a tf.GraphDef and the synsets map an integer label from 0-1000 to a list of names
Return type:dict
cadl.i2v.i2v_download()[source]

Download a pretrained i2v network.

Returns:Description
Return type:TYPE
cadl.i2v.i2v_tag_download()[source]

Download a pretrained i2v network.

Returns:Description
Return type:TYPE
cadl.i2v.preprocess(img, crop=True, resize=True, dsize=(224, 224))[source]

Summary

Parameters:
  • img (TYPE) – Description
  • crop (bool, optional) – Description
  • resize (bool, optional) – Description
  • dsize (tuple, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.i2v.test_i2v()[source]

Loads the i2v network and applies it to a test image.

cadl.inception module

Inception model, download, and preprocessing.

cadl.inception.deprocess(img)[source]

Summary

Parameters:img (TYPE) – Description
Returns:Description
Return type:TYPE
cadl.inception.get_inception_model(data_dir='inception', version='v5')[source]

Get a pretrained inception network.

Parameters:
  • data_dir (str, optional) – Location of the pretrained inception network download.
  • version (str, optional) – Version of the model: [‘v3’] or ‘v5’.
Returns:

net – {‘graph_def’: graph_def, ‘labels’: synsets} where the graph_def is a tf.GraphDef and the synsets map an integer label from 0-1000 to a list of names

Return type:

dict

cadl.inception.inception_download(data_dir='inception', version='v5')[source]

Download a pretrained inception network.

Parameters:
  • data_dir (str, optional) – Location of the pretrained inception network download.
  • version (str, optional) – Version of the model: [‘v3’] or ‘v5’.
Returns:

Description

Return type:

TYPE

cadl.inception.preprocess(img, crop=True, resize=True, dsize=(299, 299))[source]

Summary

Parameters:
  • img (TYPE) – Description
  • crop (bool, optional) – Description
  • resize (bool, optional) – Description
  • dsize (tuple, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.inception.test_inception()[source]

Loads the inception network and applies it to a test image.

cadl.librispeech module

LibriSpeech dataset, batch processing, and preprocessing.

cadl.librispeech.batch_generator(dataset, batch_size=32, max_sequence_length=6144, maxval=32768.0, threshold=0.2, normalize=True)[source]

Summary

Parameters:
  • dataset (TYPE) – Description
  • batch_size (int, optional) – Description
  • max_sequence_length (int, optional) – Description
  • maxval (float, optional) – Description
  • threshold (float, optional) – Description
  • normalize (bool, optional) – Description
Yields:

TYPE – Description

cadl.librispeech.get_dataset(saveto='librispeech', convert_to_wav=False, kind='dev')[source]

Download the LibriSpeech dataset and convert to wav files.

More info: http://www.openslr.org/12/

This interface downloads the LibriSpeech dataset and attempts to convert the FLAC files to WAV using ffmpeg. If you do not have ffmpeg installed, this function will not be able to convert the files.

Parameters:
  • saveto (str) – Directory to save the resulting dataset [‘librispeech’]
  • convert_to_wav (bool, optional) – Description
  • kind (str, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.magenta_utils module

cadl.nb_utils module

Utility for displaying Tensorflow graphs.

cadl.nb_utils.show_graph(graph_def)[source]

Summary

Parameters:graph_def (TYPE) – Description

cadl.nsynth module

NSynth: WaveNet Autoencoder.

class cadl.nsynth.Config(encoding, train_path=None)[source]

Bases: object

Configuration object that helps manage the graph.

ae_bottleneck_width

int – Description

ae_hop_length

int – Description

encoding

TYPE – Description

learning_rate_schedule

TYPE – Description

num_iters

int – Description

train_path

TYPE – Description

build(inputs, is_training)[source]

Build the graph for this configuration.

Parameters:
  • inputs – A dict of inputs. For training, should contain ‘wav’.
  • is_training – Whether we are training or not. Not used in this config.
Returns:

A dict of outputs that includes the ‘predictions’, ‘loss’, the ‘encoding’, the ‘quantized_input’, and whatever metrics we want to track for eval.

get_batch(batch_size)[source]

Summary

Parameters:batch_size (TYPE) – Description
Returns:Description
Return type:TYPE
class cadl.nsynth.FastGenerationConfig[source]

Bases: object

Configuration object that helps manage the graph.

build(inputs)[source]

Build the graph for this configuration.

Parameters:inputs – A dict of inputs. For training, should contain ‘wav’.
Returns:A dict of outputs that includes the ‘predictions’, ‘loss’, the ‘encoding’, the ‘quantized_input’, and whatever metrics we want to track for eval.
Deleted Parameters:is_training – Whether we are training or not. Not used in this config.
cadl.nsynth.causal_linear(x, n_inputs, n_outputs, name, filter_length, rate, batch_size)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • n_inputs (TYPE) – Description
  • n_outputs (TYPE) – Description
  • name (TYPE) – Description
  • filter_length (TYPE) – Description
  • rate (TYPE) – Description
  • batch_size (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.nsynth.get_model()[source]

Summary

cadl.nsynth.inv_mu_law(x, mu=255.0)[source]

A TF implementation of inverse Mu-Law.

Parameters:
  • x – The Mu-Law samples to decode.
  • mu – The Mu we used to encode these samples.
Returns:

The decoded data.

Return type:

out
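
An independent numpy illustration of the textbook inverse Mu-Law formula for samples already scaled to [-1, 1]; the TF implementation above may additionally rescale its integer-coded input before applying it:

    import numpy as np

    def inv_mu_law_np(y, mu=255.0):
        return np.sign(y) * (1.0 / mu) * ((1.0 + mu) ** np.abs(y) - 1.0)

    y = np.linspace(-1, 1, 5)
    print(inv_mu_law_np(y))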

cadl.nsynth.linear(x, n_inputs, n_outputs, name)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • n_inputs (TYPE) – Description
  • n_outputs (TYPE) – Description
  • name (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.nsynth.load_audio(wav_file, sample_length=64000)[source]

Summary

Parameters:
  • wav_file (TYPE) – Description
  • sample_length (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.nsynth.load_fastgen_nsynth(batch_size=1, sample_length=64000)[source]

Summary

Parameters:
  • batch_size (int, optional) – Description
  • sample_length (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.nsynth.load_nsynth(encoding=True, batch_size=1, sample_length=64000)[source]

Summary

Parameters:
  • encoding (bool, optional) – Description
  • batch_size (int, optional) – Description
  • sample_length (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.nsynth.synthesize(wav_file, out_file='synthesis.wav', sample_length=64000, synth_length=16000, ckpt_path='./model.ckpt-200000', resample_encoding=False)[source]

Summary

Parameters:
  • wav_file (TYPE) – Description
  • out_file (str, optional) – Description
  • sample_length (int, optional) – Description
  • synth_length (int, optional) – Description
  • ckpt_path (str, optional) – Description
  • resample_encoding (bool, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.pixelcnn module

Conditional Gated Pixel CNN.

cadl.pixelcnn.build_conditional_pixel_cnn_model(B=None, H=32, W=32, C=3, n_conditionals=None)[source]

Conditional Gated Pixel CNN Model.

van den Oord, A., Kalchbrenner, N., Vinyals, O., Espeholt, L., Graves, A., & Kavukcuoglu, K. (2016). Conditional Image Generation with PixelCNN Decoders.

Implements most of the paper, except for the autoencoder, the triplet loss on face embeddings, and the pad/crop/shift ops for convolution (omitted as they are less clear, in my opinion, from a pedagogical point of view).

Parameters:
  • B (None, optional) – Description
  • H (int, optional) – Description
  • W (int, optional) – Description
  • C (int, optional) – Description
  • n_conditionals (None, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.pixelcnn.gated_conv2d(X, K_h, K_w, K_c, strides=[1, 1, 1, 1], padding='SAME', mask=None, cond_h=None, vertical_h=None)[source]

Summary

Parameters:
  • X (TYPE) – Description
  • K_h (TYPE) – Description
  • K_w (TYPE) – Description
  • K_c (TYPE) – Description
  • strides (list, optional) – Description
  • padding (str, optional) – Description
  • mask (None, optional) – Description
  • cond_h (None, optional) – Description
  • vertical_h (None, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.pixelcnn.generate()[source]

Summary

cadl.pixelcnn.train_tiny_imagenet(ckpt_path='pixelcnn', n_epochs=1000, save_step=100, write_step=25, B=32, H=64, W=64, C=3)[source]

Summary

Parameters:
  • ckpt_path (str, optional) – Description
  • n_epochs (int, optional) – Description
  • save_step (int, optional) – Description
  • write_step (int, optional) – Description
  • B (int, optional) – Description
  • H (int, optional) – Description
  • W (int, optional) – Description
  • C (int, optional) – Description

cadl.pixelrnn module

Basic PixelRNN, i.e. CharRNN style; none of the fancier variants (Row, Diagonal, or BiDiagonal LSTMs).

cadl.pixelrnn.build_pixel_rnn_basic_model(B=50, H=32, W=32, C=32, n_units=100, n_layers=2)[source]

Summary

Parameters:
  • B (int, optional) – Description
  • H (int, optional) – Description
  • W (int, optional) – Description
  • C (int, optional) – Description
  • n_units (int, optional) – Description
  • n_layers (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.pixelrnn.infer(sess, net, H, W, C, pixel_value=128, state=None)[source]

Summary

Parameters:
  • sess (TYPE) – Description
  • net (TYPE) – Description
  • H (TYPE) – Description
  • W (TYPE) – Description
  • C (TYPE) – Description
  • pixel_value (int, optional) – Description
  • state (None, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.pixelrnn.train_tiny_imagenet()[source]

Summary

cadl.seq2seq module

Sequence to Sequence models w/ Attention and BiDirectional Dynamic RNNs.

cadl.seq2seq.batch_generator(sources, targets, source_lengths, target_lengths, batch_size=50)[source]

Summary

Parameters:
  • sources (TYPE) – Description
  • targets (TYPE) – Description
  • source_lengths (TYPE) – Description
  • target_lengths (TYPE) – Description
  • batch_size (int, optional) – Description
Yields:

TYPE – Description

cadl.seq2seq.create_model(source_vocab_size=10000, target_vocab_size=10000, input_embed_size=512, target_embed_size=512, share_input_and_target_embedding=True, n_neurons=512, n_layers=4, use_attention=True, max_sequence_size=30)[source]

Summary

Parameters:
  • source_vocab_size (int, optional) – Description
  • target_vocab_size (int, optional) – Description
  • input_embed_size (int, optional) – Description
  • target_embed_size (int, optional) – Description
  • share_input_and_target_embedding (bool, optional) – Description
  • n_neurons (int, optional) – Description
  • n_layers (int, optional) – Description
  • use_attention (bool, optional) – Description
  • max_sequence_size (int, optional) – Description
Returns:

Description

Return type:

TYPE

Raises:

ValueError – Description

cadl.seq2seq.id2word(ids, vocab)[source]

Summary

Parameters:
  • ids (TYPE) – Description
  • vocab (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.seq2seq.preprocess(text, min_count=5, min_length=3, max_length=30)[source]

Summary

Parameters:
  • text (TYPE) – Description
  • min_count (int, optional) – Description
  • min_length (int, optional) – Description
  • max_length (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.seq2seq.train(text, max_sequence_size=20, use_attention=True, min_count=25, min_length=5, n_epochs=1000, batch_size=100)[source]

Summary

Parameters:
  • text (TYPE) – Description
  • max_sequence_size (int, optional) – Description
  • use_attention (bool, optional) – Description
  • min_count (int, optional) – Description
  • min_length (int, optional) – Description
  • n_epochs (int, optional) – Description
  • batch_size (int, optional) – Description
cadl.seq2seq.train_cornell(**kwargs)[source]

Summary

Parameters:**kwargs – Description
Returns:Description
Return type:TYPE
cadl.seq2seq.word2id(words, vocab)[source]

Summary

Parameters:
  • words (TYPE) – Description
  • vocab (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.squeezenet module

SqueezeNet

cadl.squeezenet.fire_module(input, fire_id, channel, s1, e1, e3)[source]
Basic module that makes up the SqueezeNet architecture. It has two layers:
  1. Squeeze layer (1x1 convolutions)
  2. Expand layer (1x1 and 3x3 convolutions)
Parameters:
  • input (TYPE) – TensorFlow tensor
  • fire_id (TYPE) – Variable scope name
  • channel (TYPE) – Depth of the previous output
  • s1 (TYPE) – Number of filters for squeeze 1x1 layer
  • e1 (TYPE) – Number of filters for expand 1x1 layer
  • e3 (TYPE) – Number of filters for expand 3x3 layer
Returns:

TensorFlow tensor

Return type:

TYPE
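
A hedged TensorFlow 1.x sketch of the fire-module idea described above (a 1x1 squeeze convolution followed by parallel 1x1 and 3x3 expand convolutions concatenated along the channel axis); this is an illustration, not the cadl implementation:

    import tensorflow as tf

    def fire_module_sketch(x, s1, e1, e3, scope='fire'):
        with tf.variable_scope(scope):
            squeeze = tf.layers.conv2d(x, s1, 1, activation=tf.nn.relu, name='squeeze_1x1')
            expand1 = tf.layers.conv2d(squeeze, e1, 1, activation=tf.nn.relu, name='expand_1x1')
            expand3 = tf.layers.conv2d(squeeze, e3, 3, padding='same',
                                       activation=tf.nn.relu, name='expand_3x3')
            return tf.concat([expand1, expand3], axis=-1)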

cadl.squeezenet.squeeze_net(input, classes)[source]

SqueezeNet model written in TensorFlow. It provides AlexNet-level accuracy with 50x fewer parameters and a smaller model size.

Parameters:
  • input (TYPE) – Input tensor (4D)
  • classes (TYPE) – Number of classes for classification
Returns:

TensorFlow tensor

Return type:

TYPE

cadl.stats module

cadl.stylenet module

Style Net w/ tests for Video Style Net.

cadl.stylenet.make_4d(img)[source]

Create a 4-dimensional N x H x W x C image.

Parameters:img (np.ndarray) – Given image as H x W x C or H x W.
Returns:img – N x H x W x C image.
Return type:np.ndarray
Raises:ValueError – Unexpected number of dimensions.
cadl.stylenet.stylize(content_img, style_img, base_img=None, saveto=None, gif_step=5, n_iterations=100, style_weight=1.0, content_weight=1.0)[source]

Stylization w/ the given content and style images.

Follows the approach in Leon Gatys et al.

Parameters:
  • content_img (np.ndarray) – Image to use for finding the content features.
  • style_img (TYPE) – Image to use for finding the style features.
  • base_img (None, optional) – Image to use for the base content. Can be noise or an existing image. If None, the content image will be used.
  • saveto (str, optional) – Name of GIF image to write to, e.g. “stylization.gif”
  • gif_step (int, optional) – Modulo of iterations to save the current stylization.
  • n_iterations (int, optional) – Number of iterations to run for.
  • style_weight (float, optional) – Weighting on the style features.
  • content_weight (float, optional) – Weighting on the content features.
Returns:

stylization – Final iteration of the stylization.

Return type:

np.ndarray
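
A hedged usage sketch; the image file names are illustrative only:

    import matplotlib.pyplot as plt
    from cadl import stylenet

    content = plt.imread('content.jpg')
    style = plt.imread('style.jpg')
    result = stylenet.stylize(content, style, n_iterations=100,
                              style_weight=5.0, content_weight=1.0,
                              saveto='stylization.gif')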

cadl.stylenet.test()[source]

Test for artistic stylization.

cadl.stylenet.test_video(style_img='arles.jpg', videodir='kurosawa')[source]
cadl.stylenet.warp_img(img, dx, dy)[source]

Apply the motion vectors to the given image.

Parameters:
  • img (np.ndarray) – Input image to apply motion to.
  • dx (np.ndarray) – H x W matrix defining the magnitude of the X vector
  • dy (np.ndarray) – H x W matrix defining the magnitude of the Y vector
Returns:

img – Image with pixels warped according to dx, dy.

Return type:

np.ndarray

cadl.tedlium module

TEDLium Dataset.

cadl.tedlium.get_dataset()[source]

Summary

Returns:Description
Return type:TYPE

cadl.utils module

Various utilities including downloading, common layers, etc.

cadl.utils.bias_variable(shape, **kwargs)[source]

Helper function to create a bias variable initialized with a constant value.

Parameters:
  • shape (list) – Size of weight variable
  • **kwargs – Description
Returns:

Description

Return type:

TYPE

cadl.utils.binary_cross_entropy(z, x, name=None)[source]

Binary Cross Entropy measures cross entropy of a binary variable.

loss(x, z) = - sum_i (x[i] * log(z[i]) + (1 - x[i]) * log(1 - z[i]))

Parameters:
  • z (tf.Tensor) – A Tensor of the same type and shape as x.
  • x (tf.Tensor) – A Tensor of type float32 or float64.
  • name (None, optional) – Description
Returns:

Description

Return type:

TYPE
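
A quick numpy check of the loss formula above, with z the prediction in (0, 1) and x the binary target:

    import numpy as np

    x = np.array([1.0, 0.0, 1.0])
    z = np.array([0.9, 0.2, 0.6])
    loss = -np.sum(x * np.log(z) + (1 - x) * np.log(1 - z))
    print(loss)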

cadl.utils.build_submission(filename, file_list, optional_file_list=())[source]

Helper utility to check homework assignment submissions and package them.

Parameters:
  • filename (str) – Output zip file name
  • file_list (tuple) – Tuple of files to include
  • optional_file_list (tuple, optional) – Description
cadl.utils.conv2d(x, n_output, k_h=5, k_w=5, d_h=2, d_w=2, padding='SAME', name='conv2d', reuse=None)[source]

Helper for creating a 2d convolution operation.

Parameters:
  • x (tf.Tensor) – Input tensor to convolve.
  • n_output (int) – Number of filters.
  • k_h (int, optional) – Kernel height
  • k_w (int, optional) – Kernel width
  • d_h (int, optional) – Height stride
  • d_w (int, optional) – Width stride
  • padding (str, optional) – Padding type: “SAME” or “VALID”
  • name (str, optional) – Variable scope
  • reuse (None, optional) – Description
Returns:

op – Output of convolution

Return type:

tf.Tensor

cadl.utils.convolve(img, kernel)[source]

Use Tensorflow to convolve a 4D image with a 4D kernel.

Parameters:
  • img (np.ndarray) – 4-dimensional image shaped N x H x W x C
  • kernel (np.ndarray) – 4-dimensional kernel shaped K_H x K_W x C_I x C_O, corresponding to the kernel’s height and width, the number of input channels, and the number of output channels. Note that C_I should = C.
Returns:

result – Convolved result.

Return type:

np.ndarray

cadl.utils.corrupt(x)[source]

Take an input tensor and add uniform masking.

Parameters:x (Tensor/Placeholder) – Input to corrupt.
Returns:x_corrupted – 50 pct of values corrupted.
Return type:Tensor
cadl.utils.deconv2d(x, n_output_h, n_output_w, n_output_ch, n_input_ch=None, k_h=5, k_w=5, d_h=2, d_w=2, padding='SAME', name='deconv2d', reuse=None)[source]

Deconvolution helper.

Parameters:
  • x (tf.Tensor) – Input tensor to convolve.
  • n_output_h (int) – Height of output
  • n_output_w (int) – Width of output
  • n_output_ch (int) – Number of filters.
  • n_input_ch (None, optional) – Description
  • k_h (int, optional) – Kernel height
  • k_w (int, optional) – Kernel width
  • d_h (int, optional) – Height stride
  • d_w (int, optional) – Width stride
  • padding (str, optional) – Padding type: “SAME” or “VALID”
  • name (str, optional) – Variable scope
  • reuse (None, optional) – Description
Returns:

op – Output of deconvolution

Return type:

tf.Tensor

cadl.utils.download(path)[source]

Use urllib to download a file.

Parameters:path (str) – Url to download
Returns:path – Location of downloaded file.
Return type:str
cadl.utils.download_and_extract_tar(path, dst)[source]

Download and extract a tar file.

Parameters:
  • path (str) – Url to tar file to download.
  • dst (str) – Location to save tar file contents.
cadl.utils.download_and_extract_zip(path, dst)[source]

Download and extract a zip file.

Parameters:
  • path (str) – Url to zip file to download.
  • dst (str) – Location to save zip file contents.
cadl.utils.exists(site)[source]

Summary

Parameters:site (TYPE) – Description
Returns:Description
Return type:TYPE
cadl.utils.flatten(x, name=None, reuse=None)[source]

Flatten Tensor to 2-dimensions.

Parameters:
  • x (tf.Tensor) – Input tensor to flatten.
  • name (None, optional) – Variable scope for flatten operations
  • reuse (None, optional) – Description
Returns:

flattened – Flattened tensor.

Return type:

tf.Tensor

Raises:

ValueError – Description

cadl.utils.gabor(ksize=32)[source]

Use Tensorflow to compute a 2D Gabor Kernel.

Parameters:ksize (int, optional) – Size of kernel.
Returns:gabor – Gabor kernel with ksize x ksize dimensions.
Return type:np.ndarray
cadl.utils.gauss(mean, stddev, ksize)[source]

Use Tensorflow to compute a Gaussian Kernel.

Parameters:
  • mean (float) – Mean of the Gaussian (e.g. 0.0).
  • stddev (float) – Standard Deviation of the Gaussian (e.g. 1.0).
  • ksize (int) – Size of kernel (e.g. 16).
Returns:

kernel – Computed Gaussian Kernel using Tensorflow.

Return type:

np.ndarray

cadl.utils.gauss2d(mean, stddev, ksize)[source]

Use Tensorflow to compute a 2D Gaussian Kernel.

Parameters:
  • mean (float) – Mean of the Gaussian (e.g. 0.0).
  • stddev (float) – Standard Deviation of the Gaussian (e.g. 1.0).
  • ksize (int) – Size of kernel (e.g. 16).
Returns:

kernel – Computed 2D Gaussian Kernel using Tensorflow.

Return type:

np.ndarray

cadl.utils.get_celeb_files(dst='img_align_celeba', max_images=100)[source]

Download the first 100 images of the celeb dataset.

Files will be placed in a directory ‘img_align_celeba’ if one doesn’t exist.

Returns:

files – Locations to the first 100 images of the celeb net dataset.

Return type:

list of strings

Parameters:
  • dst (str, optional) – Description
  • max_images (int, optional) – Description
cadl.utils.get_celeb_imgs(max_images=100)[source]

Load the first max_images images of the celeb dataset.

Returns:imgs – List of the first 100 images from the celeb dataset
Return type:list of np.ndarray
Parameters:max_images (int, optional) – Description
cadl.utils.imcrop_tosquare(img)[source]

Make any image a square image.

Parameters:img (np.ndarray) – Input image to crop, assumed at least 2d.
Returns:crop – Cropped image.
Return type:np.ndarray
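
A NumPy sketch of a central square crop (one plausible behaviour; how the library handles odd-pixel remainders may differ):

    import numpy as np

    def imcrop_tosquare_np(img):
        # Crop the longer dimension symmetrically so that height == width.
        h, w = img.shape[:2]
        if h > w:
            off = (h - w) // 2
            return img[off:off + w, ...]
        elif w > h:
            off = (w - h) // 2
            return img[:, off:off + h, ...]
        return img
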
cadl.utils.interp(l, r, n_samples)[source]

Interpolate between the arrays l and r, n_samples times.

Parameters:
  • l (np.ndarray) – Left edge
  • r (np.ndarray) – Right edge
  • n_samples (int) – Number of samples
Returns:

arr – Interpolated array

Return type:

np.ndarray
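
A NumPy sketch of the linear interpolation this describes (whether the endpoints are included is an assumption):

    import numpy as np

    def interp_np(l, r, n_samples):
        # Blend from l to r in n_samples evenly spaced steps.
        return np.array([l + (r - l) * t
                         for t in np.linspace(0.0, 1.0, n_samples)])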

cadl.utils.linear(x, n_output, name=None, activation=None, reuse=None)[source]

Fully connected layer.

Parameters:
  • x (tf.Tensor) – Input tensor to connect
  • n_output (int) – Number of output neurons
  • name (None, optional) – Scope to apply
  • activation (None, optional) – Description
  • reuse (None, optional) – Description
Returns:

h, W – Output of fully connected layer and the weight matrix

Return type:

tf.Tensor, tf.Tensor
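
A minimal TF 1.x sketch of what such a layer computes, h = activation(xW + b); the variable names and the initializer used here are assumptions, not the library’s exact choices:

    import tensorflow as tf

    def linear_sketch(x, n_output, name='fc', activation=None, reuse=None):
        n_input = x.get_shape().as_list()[1]
        with tf.variable_scope(name, reuse=reuse):
            W = tf.get_variable('W', [n_input, n_output],
                                initializer=tf.random_normal_initializer(stddev=0.02))
            b = tf.get_variable('b', [n_output],
                                initializer=tf.constant_initializer(0.0))
            h = tf.matmul(x, W) + b
            if activation is not None:
                h = activation(h)
        return h, W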

cadl.utils.load_audio(filename, b_normalize=True)[source]

Load the audio file at the provided filename using scipy.io.wavfile.

Optionally normalizes the audio to the maximum value.

Parameters:
  • filename (str) – File to load.
  • b_normalize (bool, optional) – Normalize to the maximum value.
Returns:

Description

Return type:

TYPE

cadl.utils.lrelu(features, leak=0.2)[source]

Leaky rectifier.

Parameters:
  • features (tf.Tensor) – Input to apply leaky rectifier to.
  • leak (float, optional) – Percentage of leak.
Returns:

op – Resulting output of applying leaky rectifier activation.

Return type:

tf.Tensor
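
The leaky rectifier passes positive values through unchanged and scales negative values by leak, i.e. f(x) = max(x, leak * x). A one-line TF sketch:

    import tensorflow as tf

    def lrelu_sketch(features, leak=0.2):
        # Positive inputs are unchanged; negatives are multiplied by `leak`.
        return tf.maximum(features, leak * features)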

cadl.utils.make_latent_manifold(corners, n_samples)[source]

Create a 2D manifold out of the provided corners, producing n_samples * n_samples samples.

Parameters:
  • corners (list of np.ndarray) – The four corners to interpolate.
  • n_samples (int) – Number of samples to use in interpolation.
Returns:

arr – Stacked array of all 2D interpolated samples

Return type:

np.ndarray
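
One way to build such a manifold is bilinear interpolation between the four corners. The corner ordering assumed below (top-left, top-right, bottom-left, bottom-right) is an assumption and may differ from the library’s:

    import numpy as np

    def make_latent_manifold_np(corners, n_samples):
        tl, tr, bl, br = corners
        samples = []
        for row_t in np.linspace(0.0, 1.0, n_samples):
            # Interpolate down the left and right edges...
            left = tl + (bl - tl) * row_t
            right = tr + (br - tr) * row_t
            for col_t in np.linspace(0.0, 1.0, n_samples):
                # ...then across each row.
                samples.append(left + (right - left) * col_t)
        return np.stack(samples)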

cadl.utils.montage(images, saveto='montage.png')[source]

Draw all images as a montage separated by 1 pixel borders.

Also saves the file to the destination specified by saveto.

Parameters:
  • images (numpy.ndarray) – Input array to create montage of. Array should be: batch x height x width x channels.
  • saveto (str) – Location to save the resulting montage image.
Returns:

m – Montage image.

Return type:

numpy.ndarray

cadl.utils.montage_filters(W)[source]

Draws all filters (n_input * n_output filters) as a montage image separated by 1 pixel borders.

Parameters:W (Tensor) – Input tensor to create montage of.
Returns:m – Montage image.
Return type:numpy.ndarray
cadl.utils.normalize(a, s=0.1)[source]

Normalize the image range for visualization

Parameters:
  • a (TYPE) – Description
  • s (float, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.utils.sample_categorical(pmf)[source]

Sample from a categorical distribution.

Parameters:pmf – Probability mass function. Output of a softmax over categories. Array of shape [batch_size, number of categories]. Rows sum to 1.
Returns:Array of size [batch_size, 1]. Integer of category sampled.
Return type:idxs
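
A NumPy sketch of per-row categorical sampling via the inverse-CDF method (the library’s implementation may differ):

    import numpy as np

    def sample_categorical_np(pmf):
        # pmf: [batch_size, n_categories], each row sums to 1.
        cdf = np.cumsum(pmf, axis=1)
        u = np.random.uniform(size=(pmf.shape[0], 1))
        # First category whose cumulative mass exceeds the uniform draw.
        idxs = (u < cdf).argmax(axis=1)
        return idxs.reshape(-1, 1)
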
cadl.utils.slice_montage(montage, img_h, img_w, n_imgs)[source]

Slice a montage image into n_img h x w images.

Performs the opposite of the montage function. Takes a montage image and slices it back into an N x H x W x C array.

Parameters:
  • montage (np.ndarray) – Montage image to slice.
  • img_h (int) – Height of sliced image
  • img_w (int) – Width of sliced image
  • n_imgs (int) – Number of images to slice
Returns:

sliced – Sliced images as 4d array.

Return type:

np.ndarray
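
A usage sketch of the round trip between montage and slice_montage (assumes the cadl package is importable; whether the recovered batch matches the input exactly depends on how the 1-pixel borders are handled):

    import numpy as np
    from cadl.utils import montage, slice_montage

    # Hypothetical batch of 16 RGB images.
    imgs = np.random.rand(16, 64, 64, 3)
    m = montage(imgs, saveto='montage.png')
    # Slice the montage back into a [16, 64, 64, 3] batch.
    sliced = slice_montage(m, img_h=64, img_w=64, n_imgs=16)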

cadl.utils.stdout_redirect(where)[source]

Summary

Parameters:where (TYPE) – Description
Yields:TYPE – Description
cadl.utils.to_tensor(x)[source]

Convert a 2-D Tensor to a 4-D Tensor ready for convolution.

Performs the opposite of flatten(x). If the tensor is already 4-D, this returns the same as the input, leaving it unchanged.

Parameters:x (tf.Tensor) – Input 2-D tensor. If 4-D already, left unchanged.
Returns:x – 4-D representation of the input.
Return type:tf.Tensor
Raises:ValueError – If the tensor is not 2D or already 4D.
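
A sketch of one plausible inverse of flatten, assuming the flattened dimension is the square of a single-channel image’s side (this reshape is an assumption; the actual behaviour may differ):

    import numpy as np
    import tensorflow as tf

    def to_tensor_sketch(x):
        n_dims = len(x.get_shape().as_list())
        if n_dims == 4:
            return x
        if n_dims != 2:
            raise ValueError('Expected a 2-D or 4-D tensor.')
        # Assume [N, D] where D = side * side of a square, 1-channel image.
        side = int(np.sqrt(x.get_shape().as_list()[1]))
        return tf.reshape(x, [-1, side, side, 1])
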
cadl.utils.weight_variable(shape, **kwargs)[source]

Helper function to create a weight variable initialized with a normal distribution

Parameters:
  • shape (list) – Size of weight variable
  • **kwargs – Description
Returns:

Description

Return type:

TYPE

cadl.vae module

Convolutional/Variational autoencoder, including demonstration of training such a network on MNIST, CelebNet and the film, “Sita Sings The Blues” using an image pipeline.

cadl.vae.VAE(input_shape=[None, 784], n_filters=[64, 64, 64], filter_sizes=[4, 4, 4], n_hidden=32, n_code=2, activation=<function tanh>, dropout=False, denoising=False, convolutional=False, variational=False)[source]

(Variational) (Convolutional) (Denoising) Autoencoder.

Uses tied weights.

Parameters:
  • input_shape (list, optional) – Shape of the input to the network. e.g. for MNIST: [None, 784].
  • n_filters (list, optional) – Number of filters for each layer. If convolutional=True, this refers to the total number of output filters to create for each layer, with each layer’s number of output filters as a list. If convolutional=False, then this refers to the total number of neurons for each layer in a fully connected network.
  • filter_sizes (list, optional) – Only applied when convolutional=True. This refers to the ksize (height and width) of each convolutional layer.
  • n_hidden (int, optional) – Only applied when variational=True. This refers to the first fully connected layer prior to the variational embedding, directly after the encoding. After the variational embedding, another fully connected layer is created with the same size prior to decoding. Set to 0 to not use an additional hidden layer.
  • n_code (int, optional) – Only applied when variational=True. This refers to the number of latent Gaussians to sample for creating the inner most encoding.
  • activation (function, optional) – Activation function to apply to each layer, e.g. tf.nn.relu
  • dropout (bool, optional) – Whether or not to apply dropout. If using dropout, you must feed a value for ‘keep_prob’, as returned in the dictionary. 1.0 means no dropout is used. 0.0 means every connection is dropped. Sensible values are between 0.5-0.8.
  • denoising (bool, optional) – Whether or not to apply denoising. If using denoising, you must feed a value for ‘corrupt_prob’, as returned in the dictionary. 1.0 means no corruption is used. 0.0 means every feature is corrupted. Sensible values are between 0.5-0.8.
  • convolutional (bool, optional) – Whether or not to use a convolutional network; otherwise a fully connected network will be created. This affects the meaning of the n_filters parameter.
  • variational (bool, optional) – Whether or not to create a variational embedding layer. If n_hidden is greater than 0, a fully connected layer is created after the encoding; then a multivariate Gaussian sampling layer is created, followed by another fully connected layer. The size of the fully connected layers is determined by n_hidden, and the size of the sampling layer is determined by n_code.
Returns:

model

{

‘cost’: Tensor to optimize.
‘Ws’: All weights of the encoder.
‘x’: Input Placeholder.
‘z’: Innermost encoding Tensor (latent features).
‘y’: Reconstruction of the Decoder.
‘keep_prob’: Amount to keep when using Dropout.
‘corrupt_prob’: Amount to corrupt when using Denoising.
‘train’: Set to True when training / applies to Batch Normalization.

}

Return type:

dict
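
A training-loop sketch using the returned dictionary keys described above (TF 1.x; the random batch stands in for a real data pipeline, and the exact placeholder shapes for keep_prob and train are assumptions):

    import numpy as np
    import tensorflow as tf
    from cadl.vae import VAE

    model = VAE(input_shape=[None, 784], convolutional=False,
                variational=True, n_hidden=32, n_code=2, dropout=True)
    optimizer = tf.train.AdamOptimizer(1e-3).minimize(model['cost'])

    batch_x = np.random.rand(64, 784).astype(np.float32)  # stand-in data
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(optimizer, feed_dict={
            model['x']: batch_x,
            model['train']: True,     # batch normalization in training mode
            model['keep_prob']: 0.8,  # dropout keep probability
        })

If denoising=True were used, a value for model[‘corrupt_prob’] would also need to be fed, as noted above.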

cadl.vae.test_celeb()[source]

Train an autoencoder on Celeb Net.

cadl.vae.test_mnist()[source]

Train an autoencoder on MNIST.

This function will train an autoencoder on MNIST and also save many image files during the training process, demonstrating the latent space of the innermost dimension of the encoder, as well as reconstructions of the decoder.

cadl.vae.test_sita()[source]

Train an autoencoder on Sita Sings The Blues.

cadl.vae.train_vae(files, input_shape, learning_rate=0.0001, batch_size=100, n_epochs=50, n_examples=10, crop_shape=[64, 64, 3], crop_factor=0.8, n_filters=[100, 100, 100, 100], n_hidden=256, n_code=50, convolutional=True, variational=True, filter_sizes=[3, 3, 3, 3], dropout=True, keep_prob=0.8, activation=<function relu>, img_step=100, save_step=100, ckpt_name='vae.ckpt')[source]

General purpose training of a (Variational) (Convolutional) Autoencoder.

Supply a list of file paths to images, and this will do everything else.

Parameters:
  • files (list of strings) – List of paths to images.
  • input_shape (list) – Must define what the input image’s shape is.
  • learning_rate (float, optional) – Learning rate.
  • batch_size (int, optional) – Batch size.
  • n_epochs (int, optional) – Number of epochs.
  • n_examples (int, optional) – Number of examples to use while demonstrating the current training iteration’s reconstruction. Creates a square montage, so make sure int(sqrt(n_examples))**2 = n_examples, e.g. 16, 25, 36, ... 100.
  • crop_shape (list, optional) – Size to centrally crop the image to.
  • crop_factor (float, optional) – Resize factor to apply before cropping.
  • n_filters (list, optional) – Same as VAE’s n_filters.
  • n_hidden (int, optional) – Same as VAE’s n_hidden.
  • n_code (int, optional) – Same as VAE’s n_code.
  • convolutional (bool, optional) – Use convolution or not.
  • variational (bool, optional) – Use variational layer or not.
  • filter_sizes (list, optional) – Same as VAE’s filter_sizes.
  • dropout (bool, optional) – Use dropout or not
  • keep_prob (float, optional) – Percent of keep for dropout.
  • activation (function, optional) – Which activation function to use.
  • img_step (int, optional) – How often to save training images showing the manifold and reconstruction.
  • save_step (int, optional) – How often to save checkpoints.
  • ckpt_name (str, optional) – Checkpoints will be named as this, e.g. ‘model.ckpt’
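
A usage sketch with hypothetical file paths and CelebA-style image dimensions (assumes the cadl package is importable):

    from glob import glob
    from cadl.vae import train_vae

    files = glob('img_align_celeba/*.jpg')  # hypothetical image directory
    train_vae(files,
              input_shape=[218, 178, 3],
              crop_shape=[64, 64, 3],
              crop_factor=0.8,
              batch_size=100,
              n_epochs=50,
              ckpt_name='vae.ckpt')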

cadl.vaegan module

Convolutional/Variational autoencoder, including demonstration of training such a network on MNIST, CelebNet and the film, “Sita Sings The Blues” using an image pipeline.

cadl.vaegan.VAE(input_shape=[None, 784], n_filters=[64, 64, 64], filter_sizes=[4, 4, 4], n_hidden=32, n_code=2, activation=<function tanh>, convolutional=False, variational=False)[source]

Summary

Parameters:
  • input_shape (list, optional) – Description
  • n_filters (list, optional) – Description
  • filter_sizes (list, optional) – Description
  • n_hidden (int, optional) – Description
  • n_code (int, optional) – Description
  • activation (TYPE, optional) – Description
  • convolutional (bool, optional) – Description
  • variational (bool, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.vaegan.VAEGAN(input_shape=[None, 784], n_filters=[64, 64, 64], filter_sizes=[4, 4, 4], n_hidden=32, n_code=2, activation=<function tanh>, convolutional=False, variational=False)[source]

Summary

Parameters:
  • input_shape (list, optional) – Description
  • n_filters (list, optional) – Description
  • filter_sizes (list, optional) – Description
  • n_hidden (int, optional) – Description
  • n_code (int, optional) – Description
  • activation (TYPE, optional) – Description
  • convolutional (bool, optional) – Description
  • variational (bool, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.vaegan.decoder(z, shapes, n_hidden=None, dimensions=[], filter_sizes=[], convolutional=False, activation=<function relu>, output_activation=<function relu>)[source]

Summary

Parameters:
  • z (TYPE) – Description
  • shapes (TYPE) – Description
  • n_hidden (None, optional) – Description
  • dimensions (list, optional) – Description
  • filter_sizes (list, optional) – Description
  • convolutional (bool, optional) – Description
  • activation (TYPE, optional) – Description
  • output_activation (TYPE, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.vaegan.discriminator(x, convolutional=True, filter_sizes=[5, 5, 5, 5], activation=<function relu>, n_filters=[100, 100, 100, 100])[source]

Summary

Parameters:
  • x (TYPE) – Description
  • convolutional (bool, optional) – Description
  • filter_sizes (list, optional) – Description
  • activation (TYPE, optional) – Description
  • n_filters (list, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.vaegan.encoder(x, n_hidden=None, dimensions=[], filter_sizes=[], convolutional=False, activation=<function relu>, output_activation=<function sigmoid>)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • n_hidden (None, optional) – Description
  • dimensions (list, optional) – Description
  • filter_sizes (list, optional) – Description
  • convolutional (bool, optional) – Description
  • activation (TYPE, optional) – Description
  • output_activation (TYPE, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.vaegan.test_celeb(n_epochs=100, filter_sizes=[3, 3, 3, 3], n_filters=[100, 100, 100, 100], crop_shape=[100, 100, 3])[source]

Summary

Parameters:
  • n_epochs (int, optional) – Description
cadl.vaegan.test_sita(n_epochs=100)[source]

Summary

Parameters:
  • n_epochs (int, optional) – Description
cadl.vaegan.train_vaegan(files, learning_rate=1e-05, batch_size=64, n_epochs=250, n_examples=10, input_shape=[218, 178, 3], crop_shape=[64, 64, 3], crop_factor=0.8, n_filters=[100, 100, 100, 100], n_hidden=None, n_code=128, convolutional=True, variational=True, filter_sizes=[3, 3, 3, 3], activation=<function elu>, ckpt_name='vaegan.ckpt')[source]

Summary

Parameters:
  • files (TYPE) – Description
  • learning_rate (float, optional) – Description
  • batch_size (int, optional) – Description
  • n_epochs (int, optional) – Description
  • n_examples (int, optional) – Description
  • input_shape (list, optional) – Description
  • crop_shape (list, optional) – Description
  • crop_factor (float, optional) – Description
  • n_filters (list, optional) – Description
  • n_hidden (int, optional) – Description
  • n_code (int, optional) – Description
  • convolutional (bool, optional) – Description
  • variational (bool, optional) – Description
  • filter_sizes (list, optional) – Description
  • activation (TYPE, optional) – Description
  • ckpt_name (str, optional) – Description
cadl.vaegan.variational_bayes(h, n_code)[source]

Summary

Parameters:
  • h (TYPE) – Description
  • n_code (TYPE) – Description
Returns:

name – Description

Return type:

TYPE

cadl.vctk module

VCTK Dataset download and preprocessing.

cadl.vctk.batch_generator(dataset, batch_size=32, max_sequence_length=6144, maxval=32768.0, threshold=0.2, normalize=True)[source]

Summary

Parameters:
  • dataset (TYPE) – Description
  • batch_size (int, optional) – Description
  • max_sequence_length (int, optional) – Description
  • maxval (float, optional) – Description
  • threshold (float, optional) – Description
  • normalize (bool, optional) – Description
Yields:

TYPE – Description

cadl.vctk.get_dataset(saveto='vctk', convert_to_16khz=False)[source]

Download the VCTK dataset and convert to wav files.

More info:
http://homepages.inf.ed.ac.uk/jyamagis/page3/page58/page58.html

This interface downloads the VCTK dataset and attempts to convert the FLAC recordings to WAV files using ffmpeg. If you do not have ffmpeg installed, this function will not be able to perform the conversion.

Parameters:
  • saveto (str) – Directory to save the resulting dataset [‘vctk’]
  • convert_to_16khz (bool, optional) – Description
Returns:

Description

Return type:

TYPE
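
A usage sketch (assumes the cadl package is importable and ffmpeg is on the PATH, as noted above):

    from cadl.vctk import get_dataset

    # Downloads VCTK into ./vctk and converts the recordings with ffmpeg.
    dataset = get_dataset(saveto='vctk', convert_to_16khz=True)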

cadl.vgg16 module

VGG16 pretrained model and VGG Face model.

cadl.vgg16.deprocess(img)[source]

Summary

Parameters:img (TYPE) – Description
Returns:Description
Return type:TYPE
cadl.vgg16.get_vgg_face_model()[source]

Summary

Returns:Description
Return type:TYPE
cadl.vgg16.get_vgg_model()[source]

Summary

Returns:Description
Return type:TYPE
cadl.vgg16.preprocess(img, crop=True, resize=True, dsize=(224, 224))[source]

Summary

Parameters:
  • img (TYPE) – Description
  • crop (bool, optional) – Description
  • resize (bool, optional) – Description
  • dsize (tuple, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.vgg16.test_vgg()[source]

Loads the VGG network and applies it to a test image.

cadl.vgg16.test_vgg_face()[source]

Loads the VGG network and applies it to a test image.

cadl.wavenet module

WaveNet Autoencoder and conditional WaveNet.

cadl.wavenet.condition(x, encoding)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • encoding (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.wavenet.create_wavenet(n_stages=10, n_layers_per_stage=9, n_hidden=200, batch_size=32, n_skip=100, filter_length=2, shift=True, n_quantization=256, sample_rate=16000)[source]

Summary

Parameters:
  • n_stages (int, optional) – Description
  • n_layers_per_stage (int, optional) – Description
  • n_hidden (int, optional) – Description
  • batch_size (int, optional) – Description
  • n_skip (int, optional) – Description
  • filter_length (int, optional) – Description
  • shift (bool, optional) – Description
  • n_quantization (int, optional) – Description
  • sample_rate (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.wavenet.create_wavenet_autoencoder(n_stages, n_layers_per_stage, n_hidden, batch_size, n_skip, filter_length, bottleneck_width, hop_length, n_quantization, sample_rate)[source]

Summary

Parameters:
  • n_stages (TYPE) – Description
  • n_layers_per_stage (TYPE) – Description
  • n_hidden (TYPE) – Description
  • batch_size (TYPE) – Description
  • n_skip (TYPE) – Description
  • filter_length (TYPE) – Description
  • bottleneck_width (TYPE) – Description
  • hop_length (TYPE) – Description
  • n_quantization (TYPE) – Description
  • sample_rate (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.wavenet.get_sequence_length(n_stages, n_layers_per_stage)[source]

Summary

Parameters:
  • n_stages (TYPE) – Description
  • n_layers_per_stage (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.wavenet.test_librispeech()[source]

Summary

cadl.wavenet.train_vctk()[source]

Summary

Returns:Description
Return type:TYPE

cadl.wavenet_utils module

Various utilities for training WaveNet.

cadl.wavenet_utils.batch_to_time(X, block_size)[source]

Inverse of time_to_batch(X, block_size).

Parameters:
  • X – Tensor of shape [nb*block_size, k, n] for some natural number k.
  • block_size – number of time steps (i.e. size of dimension 1) in the output tensor.
Returns:

Return type:

Tensor of shape [nb, k*block_size, n]

cadl.wavenet_utils.causal_linear(X, n_inputs, n_outputs, name, filter_length, rate, batch_size, depth=1)[source]

Applies dilated convolution using queues. Assumes a filter_length of 2 or 3.

Parameters:
  • X – The [mb, time, channels] tensor input.
  • n_inputs – The input number of channels.
  • n_outputs – The output number of channels.
  • name – The variable scope to provide to W and biases.
  • filter_length – The length of the convolution, assumed to be 3.
  • rate – The rate or dilation
  • batch_size – Non-symbolic value for batch_size.
  • depth (int, optional) – Description
Returns:

  • y – The output of the operation
  • (init_1, init_2) – Initialization operations for the queues
  • (push_1, push_2) – Push operations for the queues

cadl.wavenet_utils.conv1d(X, num_filters, filter_length, name, dilation=1, causal=True, kernel_initializer=<tensorflow.python.ops.init_ops.UniformUnitScaling object>, biases_initializer=<tensorflow.python.ops.init_ops.Constant object>)[source]

Fast 1D convolution that supports causal padding and dilation.

Parameters:
  • X – The [mb, time, channels] float tensor that we convolve.
  • num_filters – The number of filter maps in the convolution.
  • filter_length – The integer length of the filter.
  • name – The name of the scope for the variables.
  • dilation – The amount of dilation.
  • causal – Whether or not this is a causal convolution.
  • kernel_initializer – The kernel initialization function.
  • biases_initializer – The biases initialization function.
Returns:

The output of the 1D convolution.

Return type:

y
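
The key idea behind causal convolution is to pad the time axis on the left only, by dilation * (filter_length - 1) samples, so that an output at time t never depends on inputs later than t. A NumPy illustration of that padding (not the library’s implementation):

    import numpy as np

    def causal_pad(X, filter_length, dilation):
        # X: [mb, time, channels]; pad only the front of the time axis.
        pad = dilation * (filter_length - 1)
        return np.pad(X, [(0, 0), (pad, 0), (0, 0)], mode='constant')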

cadl.wavenet_utils.inv_mu_law(X, mu=255)[source]

A TF implementation of inverse Mu-Law.

Parameters:
  • X – The Mu-Law samples to decode.
  • mu – The Mu we used to encode these samples.
Returns:

The decoded data.

Return type:

out

cadl.wavenet_utils.inv_mu_law_numpy(X, mu=255.0)[source]

A numpy implementation of inverse Mu-Law.

Parameters:
  • X – The Mu-Law samples to decode.
  • mu – The Mu we used to encode these samples.
Returns:

The decoded data.

Return type:

out

cadl.wavenet_utils.linear(X, n_inputs, n_outputs, name)[source]

Summary

Parameters:
  • X (TYPE) – Description
  • n_inputs (TYPE) – Description
  • n_outputs (TYPE) – Description
  • name (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.wavenet_utils.mu_law(X, mu=255, int8=False)[source]

A TF implementation of Mu-Law encoding.

Parameters:
  • X – The audio samples to encode.
  • mu – The Mu to use in our Mu-Law.
  • int8 – Use int8 encoding.
Returns:

The Mu-Law encoded int8 data.

Return type:

out

cadl.wavenet_utils.mu_law_numpy(X, mu=255, int8=False)[source]

A NumPy implementation of Mu-Law encoding.

Parameters:
  • X – The audio samples to encode.
  • mu – The Mu to use in our Mu-Law.
  • int8 – Use int8 encoding.
Returns:

The Mu-Law encoded int8 data.

Return type:

out
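
All four Mu-Law helpers implement the standard companding formulas; a NumPy sketch of the encode/decode pair for inputs in [-1, 1] (quantization to int8 is omitted here):

    import numpy as np

    def mu_law_np(x, mu=255.0):
        # Compress: sign(x) * ln(1 + mu*|x|) / ln(1 + mu).
        return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

    def inv_mu_law_np(y, mu=255.0):
        # Expand: sign(y) * ((1 + mu)^|y| - 1) / mu.
        return np.sign(y) * ((1.0 + mu) ** np.abs(y) - 1.0) / mu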

cadl.wavenet_utils.mul_or_none(a, b)[source]

Return the element-wise product of the inputs. If either input is None, return None.

Parameters:
  • a – A tensor input.
  • b – Another tensor input with the same type as a.
Returns:

Return type:

None if either input is None. Otherwise returns a * b.

cadl.wavenet_utils.pool1d(X, window_length, name, mode='avg', stride=None)[source]

1D pooling function that supports multiple different modes.

Parameters:
  • X – The [mb, time, channels] float tensor that we are going to pool over.
  • window_length – The amount of samples we pool over.
  • name – The name of the scope for the variables.
  • mode – The type of pooling, either avg or max.
  • stride – The stride length.
Returns:

The [mb, time // stride, channels] float tensor result of pooling.

Return type:

pooled

cadl.wavenet_utils.shift_right(X)[source]

Shift the input over by one time step and prepend a zero to the front.

Parameters:X – The [mb, time, channels] tensor input.
Returns:The [mb, time, channels] tensor output.
Return type:x_sliced
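
A NumPy sketch of the shift (the library uses TF ops): prepend a zero time step and drop the last one, so each output step depends only on strictly earlier inputs.

    import numpy as np

    def shift_right_np(X):
        # X: [mb, time, channels].
        zeros = np.zeros_like(X[:, :1, :])
        return np.concatenate([zeros, X[:, :-1, :]], axis=1)
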
cadl.wavenet_utils.time_to_batch(X, block_size)[source]

Splits time dimension (i.e. dimension 1) of X into batches. Within each batch element, the k*block_size time steps are transposed, so that the k time steps in each output batch element are offset by block_size from each other. The number of input time steps must be a multiple of block_size.

Parameters:
  • X – Tensor of shape [nb, k*block_size, n] for some natural number k.
  • block_size – number of time steps (i.e. size of dimension 1) in the output tensor.
Returns:

Return type:

Tensor of shape [nb*block_size, k, n]
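
A NumPy sketch of the documented reshape semantics for time_to_batch and its inverse (assumes the number of time steps is already a multiple of block_size; any padding the library performs is not handled here):

    import numpy as np

    def time_to_batch_np(X, block_size):
        nb, t, n = X.shape
        k = t // block_size
        # Split time into k blocks of block_size, then move the within-block
        # offset into the batch dimension:
        # [nb, k*block_size, n] -> [nb*block_size, k, n].
        y = X.reshape(nb, k, block_size, n).transpose(0, 2, 1, 3)
        return y.reshape(nb * block_size, k, n)

    def batch_to_time_np(X, block_size):
        nbb, k, n = X.shape
        nb = nbb // block_size
        # Invert the transform: [nb*block_size, k, n] -> [nb, k*block_size, n].
        y = X.reshape(nb, block_size, k, n).transpose(0, 2, 1, 3)
        return y.reshape(nb, k * block_size, n)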

cadl.word2vec module

Word2Vec model.

cadl.word2vec.build_model(batch_size=128, vocab_size=50000, embedding_size=128, n_neg_samples=64)[source]

Summary

Parameters:
  • batch_size (int, optional) – Description
  • vocab_size (int, optional) – Description
  • embedding_size (int, optional) – Description
  • n_neg_samples (int, optional) – Description
Returns:

Description

Return type:

TYPE

Module contents

Copyright 2017 Parag K. Mital

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.