cadl package

Submodules

cadl.batch_norm module

Batch Normalization for TensorFlow.

cadl.batch_norm.batch_norm(x, phase_train, name='bn', decay=0.9, reuse=None, affine=True)[source]

Batch normalization on convolutional maps. From: https://stackoverflow.com/questions/33949786/how-could-i-use-batch-normalization-in-tensorflow. Only modified to infer shape from input tensor x.

[DEPRECATED] Use tflearn or slim batch normalization instead.

Parameters:
  • x – Tensor, 4D BHWD input maps
  • phase_train – boolean tf.Variable, true indicates training phase
  • name – string, variable name
  • decay (float, optional) – Description
  • reuse (None, optional) – Description
  • affine – whether to affine-transform outputs
Returns:

batch-normalized maps

Return type:

normed
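
A minimal usage sketch, assuming TensorFlow 1.x; the convolution producing h is illustrative only:

    import tensorflow as tf
    from cadl.batch_norm import batch_norm

    x = tf.placeholder(tf.float32, [None, 32, 32, 3], name='x')
    phase_train = tf.placeholder(tf.bool, name='phase_train')  # True during training
    h = tf.layers.conv2d(x, filters=16, kernel_size=3, padding='same', name='conv1')
    h_bn = batch_norm(h, phase_train, name='bn1')  # batch-normalized maps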

cadl.celeb_vaegan module

Tools for downloading the celeb dataset and model, including preprocessing.

cadl.celeb_vaegan.celeb_vaegan_download()[source]

Download a pretrained celeb vae/gan network.

Returns:Description
Return type:TYPE
cadl.celeb_vaegan.get_celeb_vaegan_model()[source]

Get a pretrained model.

Returns:
  net – a dict with the keys:
  • ‘graph_def’ (tf.GraphDef) – The graph definition
  • ‘labels’ (list) – List of different possible attributes from celeb
  • ‘attributes’ (np.ndarray) – One hot encoding of the attributes per image [n_els x n_labels]
  • ‘preprocess’ (function) – Preprocess function

Return type:dict
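
A hedged sketch of loading the returned dict and importing its graph definition, assuming TensorFlow 1.x and the dict keys described above:

    import tensorflow as tf
    from cadl import celeb_vaegan

    net = celeb_vaegan.get_celeb_vaegan_model()
    g = tf.Graph()
    with g.as_default():
        tf.import_graph_def(net['graph_def'], name='net')
    print(len(net['labels']), 'attributes;', net['attributes'].shape)
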
cadl.celeb_vaegan.preprocess(img, crop_factor=0.8)[source]

Replicate the preprocessing we did on the VAE/GAN.

This model used a crop_factor of 0.8 and crop size of [100, 100, 3].

Parameters:
  • img (TYPE) – Description
  • crop_factor (float, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.charrnn module

Character-level Recurrent Neural Network.

cadl.charrnn.build_model(txt, batch_size=1, sequence_length=1, n_layers=2, n_cells=100, gradient_clip=10.0, learning_rate=0.001)[source]

Summary

Parameters:
  • txt (TYPE) – Description
  • batch_size (int, optional) – Description
  • sequence_length (int, optional) – Description
  • n_layers (int, optional) – Description
  • n_cells (int, optional) – Description
  • gradient_clip (float, optional) – Description
  • learning_rate (float, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.charrnn.infer(txt, ckpt_name, n_iterations, n_cells=200, n_layers=3, learning_rate=0.001, max_iter=5000, gradient_clip=10.0, init_value=[0], keep_prob=1.0, sampling='prob', temperature=1.0)[source]
Parameters:
  • txt (TYPE) – Description
  • ckpt_name (TYPE) – Description
  • n_iterations (TYPE) – Description
  • n_cells (int, optional) – Description
  • n_layers (int, optional) – Description
  • learning_rate (float, optional) – Description
  • max_iter (int, optional) – Description
  • gradient_clip (float, optional) – Description
  • init_value (list, optional) – Description
  • keep_prob (float, optional) – Description
  • sampling (str, optional) – Description
  • temperature (float, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.charrnn.test_alice(max_iter=5)[source]

Summary

Parameters:max_iter (int, optional) – Description
Returns:Description
Return type:TYPE
cadl.charrnn.test_trump(max_iter=100)[source]

Summary

Parameters:max_iter (int, optional) – Description
cadl.charrnn.test_wtc()[source]

Summary

cadl.charrnn.train(txt, batch_size=100, sequence_length=150, n_cells=200, n_layers=3, learning_rate=1e-05, max_iter=50000, gradient_clip=5.0, ckpt_name='model.ckpt', keep_prob=1.0)[source]
Parameters:
  • txt (TYPE) – Description
  • batch_size (int, optional) – Description
  • sequence_length (int, optional) – Description
  • n_cells (int, optional) – Description
  • n_layers (int, optional) – Description
  • learning_rate (float, optional) – Description
  • max_iter (int, optional) – Description
  • gradient_clip (float, optional) – Description
  • ckpt_name (str, optional) – Description
  • keep_prob (float, optional) – Description
Returns:

Description

Return type:

TYPE
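
A hedged usage sketch of training a character RNN on a text corpus and then sampling from the saved checkpoint; the corpus file name and hyperparameters are illustrative only:

    from cadl import charrnn

    with open('input.txt') as f:  # hypothetical corpus
        txt = f.read()

    charrnn.train(txt, batch_size=100, sequence_length=150,
                  n_cells=200, n_layers=3, ckpt_name='model.ckpt')
    charrnn.infer(txt, ckpt_name='model.ckpt', n_iterations=500,
                  n_cells=200, n_layers=3, sampling='prob', temperature=1.0)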

cadl.cornell module

Tools for downloading and preprocessing the Cornell Movie DB.

cadl.cornell.download_cornell(dst='cornell movie-dialogs corpus')[source]

Summary

Parameters:dst (str, optional) – Description
cadl.cornell.get_characters(path='cornell movie-dialogs corpus')[source]
  • movie_characters_metadata.txt
    • contains information about each movie character
    • fields:
      • characterID
      • character name
      • movieID
      • movie title
      • gender (”?” for unlabeled cases)
      • position in credits (”?” for unlabeled cases)
Parameters:path (TYPE, optional) – Description
Returns:Description
Return type:TYPE
cadl.cornell.get_conversations(path='cornell movie-dialogs corpus')[source]
  • movie_conversations.txt
    • the structure of the conversations
    • fields
      • characterID of the first character involved in the conversation
      • characterID of the second character involved in the conversation
      • movieID of the movie in which the conversation occurred
      • list of the utterances that make the conversation, in
        chronological order: [‘lineID1’,’lineID2’,…,’lineIDN’]; has to be matched with movie_lines.txt to reconstruct the actual content
Parameters:path (TYPE, optional) – Description
Returns:Description
Return type:TYPE
cadl.cornell.get_lines(path='cornell movie-dialogs corpus')[source]
  • movie_lines.txt
    • contains the actual text of each utterance
    • fields:
      • lineID
      • characterID (who uttered this phrase)
      • movieID
      • character name
      • text of the utterance
Parameters:path (TYPE, optional) – Description
Returns:Description
Return type:TYPE
cadl.cornell.get_scripts(path='cornell movie-dialogs corpus')[source]

Summary

Parameters:path (TYPE, optional) – Description
Returns:Description
Return type:TYPE
cadl.cornell.get_titles(path='cornell movie-dialogs corpus')[source]
  • movie_titles_metadata.txt
    • contains information about each movie title
    • fields:
      • movieID,
      • movie title,
      • movie year,
      • IMDB rating,
      • no. IMDB votes,
      • genres in the format [‘genre1’,’genre2’,…,’genreN’]
Parameters:path (TYPE, optional) – Description
Returns:Description
Return type:TYPE
cadl.cornell.id2word(ids, vocab)[source]

Summary

Parameters:
  • ids (TYPE) – Description
  • vocab (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.cornell.preprocess(text, min_count=10, max_length=40)[source]

Summary

Parameters:
  • text (TYPE) – Description
  • min_count (int, optional) – Description
  • max_length (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.cornell.test_decode(sentence)[source]

Test decoding of cornell dataset with deprecated seq2seq model.

Parameters:sentence (TYPE) – Description
cadl.cornell.test_train()[source]

Test training of cornell dataset with deprecated bucketed seq2seq model.

cadl.cornell.word2id(words, vocab, UNK_ID=3)[source]

Summary

Parameters:
  • words (TYPE) – Description
  • vocab (TYPE) – Description
  • UNK_ID (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan module

Cycle Generative Adversarial Network for Unpaired Image to Image translation.

cadl.cycle_gan.batch_generator_dataset(imgs1, imgs2)[source]

Summary

Parameters:
  • imgs1 (TYPE) – Description
  • imgs2 (TYPE) – Description
Yields:

TYPE – Description

cadl.cycle_gan.batch_generator_random_crop(X, Y, min_size=256, max_size=512, n_images=100)[source]

Summary

Parameters:
  • X (TYPE) – Description
  • Y (TYPE) – Description
  • min_size (int, optional) – Description
  • max_size (int, optional) – Description
  • n_images (int, optional) – Description
Yields:

TYPE – Description

cadl.cycle_gan.conv2d(inputs, activation_fn=<function lrelu>, normalizer_fn=<function instance_norm>, scope='conv2d', **kwargs)[source]

Summary

Parameters:
  • inputs (TYPE) – Description
  • activation_fn (TYPE, optional) – Description
  • normalizer_fn (TYPE, optional) – Description
  • scope (str, optional) – Description
  • **kwargs – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.conv2d_transpose(inputs, activation_fn=<function lrelu>, normalizer_fn=<function instance_norm>, scope='conv2d_transpose', **kwargs)[source]

Summary

Parameters:
  • inputs (TYPE) – Description
  • activation_fn (TYPE, optional) – Description
  • normalizer_fn (TYPE, optional) – Description
  • scope (str, optional) – Description
  • **kwargs – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.cycle_gan(img_size=256)[source]

Summary

Parameters:img_size (int, optional) – Description
Returns:Description
Return type:TYPE
cadl.cycle_gan.decoder(x, n_filters=32, k_size=3, scope=None)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • n_filters (int, optional) – Description
  • k_size (int, optional) – Description
  • scope (None, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.discriminator(x, n_filters=64, k_size=4, scope=None, reuse=None)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • n_filters (int, optional) – Description
  • k_size (int, optional) – Description
  • scope (None, optional) – Description
  • reuse (None, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.encoder(x, n_filters=32, k_size=3, scope=None)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • n_filters (int, optional) – Description
  • k_size (int, optional) – Description
  • scope (None, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.generator(x, scope=None, reuse=None)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • scope (None, optional) – Description
  • reuse (None, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.get_images(path1, path2, img_size=256)[source]

Summary

Parameters:
  • path1 (TYPE) – Description
  • path2 (TYPE) – Description
  • img_size (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.instance_norm(x, epsilon=1e-05)[source]

Instance Normalization.

See Ulyanov, D., Vedaldi, A., & Lempitsky, V. (2016). Instance Normalization: The Missing Ingredient for Fast Stylization, Retrieved from http://arxiv.org/abs/1607.08022

Parameters:
  • x (TYPE) – Description
  • epsilon (float, optional) – Description
Returns:

Description

Return type:

TYPE
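
A hedged numpy sketch of the normalization step described in Ulyanov et al. (2016): each sample and channel is normalized over its spatial dimensions (the TensorFlow version above may additionally apply a learned scale and shift):

    import numpy as np

    def instance_norm_np(x, epsilon=1e-5):
        # x: [batch, height, width, channels]
        mean = x.mean(axis=(1, 2), keepdims=True)
        var = x.var(axis=(1, 2), keepdims=True)
        return (x - mean) / np.sqrt(var + epsilon)

    x = np.random.randn(2, 8, 8, 3).astype(np.float32)
    print(instance_norm_np(x).shape)  # (2, 8, 8, 3)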

cadl.cycle_gan.l1loss(x, y)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • y (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.l2loss(x, y)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • y (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.lrelu(x, leak=0.2, name='lrelu')[source]

Summary

Parameters:
  • x (TYPE) – Description
  • leak (float, optional) – Description
  • name (str, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.residual_block(x, n_channels=128, kernel_size=3, scope=None)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • n_channels (int, optional) – Description
  • kernel_size (int, optional) – Description
  • scope (None, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.cycle_gan.train(ds_X, ds_Y, ckpt_path='cycle_gan', learning_rate=0.0002, n_epochs=100, img_size=256)[source]

Summary

Parameters:
  • ds_X (TYPE) – Description
  • ds_Y (TYPE) – Description
  • ckpt_path (str, optional) – Description
  • learning_rate (float, optional) – Description
  • n_epochs (int, optional) – Description
  • img_size (int, optional) – Description
cadl.cycle_gan.transform(x, img_size=256)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • img_size (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.dataset_utils module

Utils for creating datasets.

class cadl.dataset_utils.Dataset(Xs, ys=None, split=[1.0, 0.0, 0.0], one_hot=False, n_classes=1)[source]

Bases: object

Create a dataset from data and their labels.

Allows easy use of train/valid/test splits; Batch generator.

all_idxs

list – All indexes across all splits.

all_inputs

list – All inputs across all splits.

all_labels

list – All labels across all splits.

n_classes

int – Number of labels.

split

list – Percentage split of train, valid, test sets.

test_idxs

list – Indexes of the test split.

train_idxs

list – Indexes of the train split.

valid_idxs

list – Indexes of the valid split.

X

Inputs/Xs/Images.

Returns:all_inputs – Original Inputs/Xs.
Return type:np.ndarray
Y

Outputs/ys/Labels.

Returns:all_labels – Original Outputs/ys.
Return type:np.ndarray
mean()[source]

Mean of the inputs/Xs.

Returns:mean – Calculates mean across 0th (batch) dimension.
Return type:np.ndarray
std()[source]

Standard deviation of the inputs/Xs.

Returns:std – Calculates std across 0th (batch) dimension.
Return type:np.ndarray
test

Test split.

Returns:split – Split of the test dataset.
Return type:DatasetSplit
train

Train split.

Returns:split – Split of the train dataset.
Return type:DatasetSplit
valid

Validation split.

Returns:split – Split of the validation dataset.
Return type:DatasetSplit
class cadl.dataset_utils.DatasetSplit(images, labels)[source]

Bases: object

Utility class for batching data and handling multiple splits.

current_batch_idx

int – Description

images

np.ndarray – Xs of the dataset. Not necessarily images.

labels

np.ndarray – ys of the dataset.

n_classes

int – Number of possible labels

num_examples

int – Number of total observations

next_batch(batch_size=100)[source]

Batch generator with randomization.

Parameters:batch_size (int, optional) – Size of each minibatch.
Yields:Xs, ys (np.ndarray, np.ndarray) – Next batch of inputs and labels (if no labels, then None).
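
A hedged sketch of wrapping arrays in a Dataset and iterating minibatches from its train split; the array shapes are illustrative only:

    import numpy as np
    from cadl.dataset_utils import Dataset

    Xs = np.random.rand(1000, 32, 32, 3).astype(np.float32)
    ys = np.random.randint(0, 10, size=1000)
    ds = Dataset(Xs, ys, split=[0.8, 0.1, 0.1], one_hot=True, n_classes=10)

    for X_batch, y_batch in ds.train.next_batch(batch_size=100):
        print(X_batch.shape, y_batch.shape)
        break
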
cadl.dataset_utils.cifar10_download(dst='cifar10')[source]

Download the CIFAR10 dataset.

Parameters:dst (str, optional) – Directory to download into.
cadl.dataset_utils.cifar10_load(dst='cifar10')[source]

Load the CIFAR10 dataset.

Downloads the dataset if it does not exist into the dst directory.

Parameters:dst (str, optional) – Location of CIFAR10 dataset.
Returns:Xs, ys – Array of data, Array of labels
Return type:np.ndarray, np.ndarray
cadl.dataset_utils.create_input_pipeline(files, batch_size, n_epochs, shape, crop_shape=None, crop_factor=1.0, n_threads=2)[source]

Creates a pipeline from a list of image files. Includes batch generator/central crop/resizing options. The resulting generator will dequeue the images batch_size at a time until it throws tf.errors.OutOfRangeError when there are no more images left in the queue.

Parameters:
  • files (list) – List of paths to image files.
  • batch_size (int) – Number of image files to load at a time.
  • n_epochs (int) – Number of epochs to run before raising tf.errors.OutOfRangeError
  • shape (list) – [height, width, channels]
  • crop_shape (list) – [height, width] to crop image to.
  • crop_factor (float) – Percentage of image to take starting from center.
  • n_threads (int, optional) – Number of threads to use for batch shuffling
Returns:

Description

Return type:

TYPE

cadl.dataset_utils.dense_to_one_hot(labels, n_classes=2)[source]

Convert class labels from scalars to one-hot vectors.

Parameters:
  • labels (array) – Input labels to convert to one-hot representation.
  • n_classes (int, optional) – Number of possible classes in the one-hot representation.
Returns:

one_hot – One hot representation of input.

Return type:

array
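
A minimal usage sketch; the expected output noted in the comment is an assumption based on the description above:

    import numpy as np
    from cadl.dataset_utils import dense_to_one_hot

    labels = np.array([0, 2, 1])
    one_hot = dense_to_one_hot(labels, n_classes=3)
    print(one_hot)  # expected shape (3, 3), e.g. rows [1, 0, 0], [0, 0, 1], [0, 1, 0]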

cadl.dataset_utils.gtzan_music_speech_download(dst='gtzan_music_speech')[source]

Download the GTZAN music and speech dataset.

Parameters:dst (str, optional) – Location to put the GTZAN music and speech dataset.
cadl.dataset_utils.gtzan_music_speech_load(dst='gtzan_music_speech')[source]

Load the GTZAN Music and Speech dataset.

Downloads the dataset if it does not exist into the dst directory.

Parameters:dst (str, optional) – Location of GTZAN Music and Speech dataset.
Returns:Xs, ys – Array of data, Array of labels
Return type:np.ndarray, np.ndarray
cadl.dataset_utils.tiny_imagenet_download(dst='tiny_imagenet')[source]

Download the Tiny ImageNet dataset.

Parameters:dst (str, optional) – Directory to download into.
cadl.dataset_utils.tiny_imagenet_load(dst='tiny_imagenet')[source]

Loads the paths to every file in the Tiny ImageNet dataset.

Downloads the dataset if it does not exist into the dst directory.

Parameters:dst (str, optional) – Location of Tiny ImageNet dataset.
Returns:all_files – List of paths to every file in the Tiny ImageNet Dataset
Return type:list

cadl.datasets module

Utils for loading common datasets.

cadl.datasets.CELEB(path='./img_align_celeba/')[source]

Attempt to load the files of the CELEB dataset.

Requires the files to already be downloaded and placed in the dst directory. The first 100 files can be downloaded with the cadl.utils function get_celeb_files.

http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html

Parameters:path (str, optional) – Directory where the aligned/cropped celeb dataset can be found.
Returns:files – List of file paths to the dataset.
Return type:list
cadl.datasets.CIFAR10(flatten=True, split=[1.0, 0.0, 0.0])[source]

Returns the CIFAR10 dataset.

Parameters:
  • flatten (bool, optional) – Convert the 3 x 32 x 32 pixels to a single vector
  • split (list, optional) – Description
Returns:

cifar – Description

Return type:

Dataset

cadl.datasets.GTZAN(path='./gtzan_music_speech')[source]

Load the GTZAN Music and Speech dataset.

Downloads the dataset if it does not exist into the dst directory.

Parameters:path (str, optional) – Description
Returns:
  • ds (Dataset) – Dataset object with array of data in X and array of labels in Y
Deleted Parameters:dst (str, optional) – Location of GTZAN Music and Speech dataset.
cadl.datasets.MNIST(one_hot=True, split=[1.0, 0.0, 0.0])[source]

Returns the MNIST dataset.

Returns:

mnist – DataSet object w/ convenience props for accessing train/validation/test sets and batches.

Return type:

DataSet

Parameters:
  • one_hot (bool, optional) – Description
  • split (list, optional) – Description
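
A hedged usage sketch, assuming the returned object exposes the same train/next_batch interface as cadl.dataset_utils.Dataset:

    from cadl import datasets

    ds = datasets.MNIST(one_hot=True, split=[0.8, 0.1, 0.1])
    for X_batch, y_batch in ds.train.next_batch(batch_size=50):
        print(X_batch.shape, y_batch.shape)
        break
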
cadl.datasets.TINYIMAGENET(path='./tiny_imagenet/')[source]

Attempt to load the files of the Tiny ImageNet dataset.

http://cs231n.stanford.edu/tiny-imagenet-200.zip https://tiny-imagenet.herokuapp.com/

Parameters:path (str, optional) – Directory where the dataset can be found or else will be placed.
Returns:
  • files (list) – List of file paths to the dataset.
  • labels (list) – List of labels for each file (only training files have labels)

cadl.deepdream module

Deep Dream using the Inception v5 network.

cadl.deepdream.deep_dream(input_img, downsize=False, model='inception', layer_i=-1, neuron_i=-1, n_iterations=100, save_gif=None, save_images='imgs', device='/cpu:0', **kwargs)[source]

Deep Dream with the given parameters.

Parameters:
  • input_img (np.ndarray) – Image to apply deep dream to. Should be 3-dimensional H x W x C RGB uint8 or float32.
  • downsize (bool, optional) – Whether or not to downsize the image. Only applies to model==’inception’.
  • model (str, optional) – Which model to load. Must be one of: [‘inception’], ‘i2v_tag’, ‘i2v’, ‘vgg16’, or ‘vgg_face’.
  • layer_i (int, optional) – Which layer to use for finding the gradient. E.g. the softmax layer for inception is -1, for vgg networks it is -2. Use the function “get_layer_names” to find the layer number that you need.
  • neuron_i (int, optional) – Which neuron to use. -1 for the entire layer.
  • n_iterations (int, optional) – Number of iterations to dream.
  • save_gif (bool, optional) – Save a GIF.
  • save_images (str, optional) – Folder to save images to.
  • device (str, optional) – Which device to use, e.g. [‘/cpu:0’] or ‘/gpu:0’.
  • **kwargs (dict) – See “_apply” for additional parameters.
Returns:

imgs – Images of every iteration

Return type:

list of np.array
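
A hedged usage sketch running Deep Dream on a single RGB image with the inception model; the choice of input image is illustrative only:

    from cadl import deepdream
    from cadl.utils import get_celeb_imgs  # any H x W x C uint8 image works

    img = get_celeb_imgs(max_images=1)[0]
    imgs = deepdream.deep_dream(img, model='inception', layer_i=-1, neuron_i=-1,
                                n_iterations=50, save_images='imgs', device='/cpu:0')
    print(len(imgs), 'frames')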

cadl.deepdream.get_labels(model='inception')[source]

Return labels corresponding to the neuron_i parameter of deep dream.

Parameters:model (str, optional) – Which model to load. Must be one of: [‘inception’], ‘i2v_tag’, ‘i2v’, ‘vgg16’, or ‘vgg_face’.
Raises:ValueError – Unknown model. Must be one of: [‘inception’], ‘i2v_tag’, ‘i2v’, ‘vgg16’, or ‘vgg_face’.
Returns:Description
Return type:TYPE
cadl.deepdream.get_layer_names(model='inception')[source]

Return every layer’s index and name in the given model.

Parameters:model (str, optional) – Which model to load. Must be one of: [‘inception’], ‘i2v_tag’, ‘i2v’, ‘vgg16’, or ‘vgg_face’.
Returns:names – The index and layer’s name for every layer in the given model.
Return type:list of tuples
cadl.deepdream.guided_dream(input_img, guide_img=None, downsize=False, layers=[162, 183, 184, 247], label_i=962, layer_i=-1, feature_loss_weight=1.0, tv_loss_weight=1.0, l2_loss_weight=1.0, softmax_loss_weight=1.0, model='inception', neuron_i=920, n_iterations=100, save_gif=None, save_images='imgs', device='/cpu:0', **kwargs)[source]

Deep Dream v2. Use an optional guide image and other techniques.

Parameters:
  • input_img (np.ndarray) – Image to apply deep dream to. Should be 3-dimensional H x W x C RGB uint8 or float32.
  • guide_img (np.ndarray, optional) – Optional image to find features at different layers for. Must pass in a list of layers that you want to find features for. Then the guided dream will try to match this images features at those layers.
  • downsize (bool, optional) – Whether or not to downsize the image. Only applies to model==’inception’.
  • layers (list, optional) – A list of layers to find features for in the “guide_img”.
  • label_i (int, optional) – Which label to use for the softmax layer. Use the “get_labels” function to find the index corresponding the object of interest. If None, not used.
  • layer_i (int, optional) – Which layer to use for finding the gradient. E.g. the softmax layer for inception is -1, for vgg networks it is -2. Use the function “get_layer_names” to find the layer number that you need.
  • feature_loss_weight (float, optional) – Weighting for the feature loss from the guide_img.
  • tv_loss_weight (float, optional) – Total variational loss weighting. Enforces smoothness.
  • l2_loss_weight (float, optional) – L2 loss weighting. Enforces smaller values and reduces saturation.
  • softmax_loss_weight (float, optional) – Softmax loss weighting. Must set label_i.
  • model (str, optional) – Which model to load. Must be one of: [‘inception’], ‘i2v_tag’, ‘i2v’, ‘vgg16’, or ‘vgg_face’.
  • neuron_i (int, optional) – Which neuron to use. -1 for the entire layer.
  • n_iterations (int, optional) – Number of iterations to dream.
  • save_gif (bool, optional) – Save a GIF.
  • save_images (str, optional) – Folder to save images to.
  • device (str, optional) – Which device to use, e.g. [‘/cpu:0’] or ‘/gpu:0’.
  • **kwargs (dict) – See “_apply” for additional parameters.
Returns:

imgs – Images of the dream.

Return type:

list of np.ndarray

cadl.dft module

Utils for performing a DFT using numpy.

cadl.dft.ctoz(mag, phs)[source]

Summary

Parameters:
  • mag (TYPE) – Description
  • phs (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.dft.dft_np(signal, hop_size=256, fft_size=512)[source]

Summary

Parameters:
  • signal (TYPE) – Description
  • hop_size (int, optional) – Description
  • fft_size (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.dft.idft_np(re, im, hop_size=256, fft_size=512)[source]

Summary

Parameters:
  • re (TYPE) – Description
  • im (TYPE) – Description
  • hop_size (int, optional) – Description
  • fft_size (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.dft.ztoc(re, im)[source]

Summary

Parameters:
  • re (TYPE) – Description
  • im (TYPE) – Description
Returns:

Description

Return type:

TYPE
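
An independent numpy illustration of the hop-based DFT idea behind this module, not the cadl.dft implementation itself: slide a window of fft_size samples over the signal every hop_size samples and take the FFT of each frame:

    import numpy as np

    def dft_frames(signal, hop_size=256, fft_size=512):
        n_frames = (len(signal) - fft_size) // hop_size
        frames = np.stack([signal[i * hop_size:i * hop_size + fft_size]
                           for i in range(n_frames)])
        return np.fft.rfft(frames, axis=-1)  # complex spectra, one row per frame

    z = dft_frames(np.random.randn(4096))
    mag, phs = np.abs(z), np.angle(z)  # polar form, cf. ctoz/ztoc
    print(z.shape)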

cadl.draw module

Deep Recurrent Attentive Writer.

cadl.draw.binary_cross_entropy(t, o, eps=1e-10)[source]

Summary

Parameters:
  • t (TYPE) – Description
  • o (TYPE) – Description
  • eps (float, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.draw.create_attention_map(h_dec, reuse=None)[source]

Summary

Parameters:
  • h_dec (TYPE) – Description
  • reuse (None, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.draw.create_filterbank(g_x, g_y, log_sigma_sq, log_delta, A=28, B=28, C=1, N=12)[source]

Summary

Parameters:
  • g_x (TYPE) – Description
  • g_y (TYPE) – Description
  • log_sigma_sq (TYPE) – Description
  • log_delta (TYPE) – Description
  • A (int, optional) – Description
  • B (int, optional) – Description
  • C (int, optional) – Description
  • N (int, optional) – Description
Returns:

  • name (TYPE) – Description
Deleted Parameters:log_sigma (type) – Description

cadl.draw.create_model(A=28, B=28, C=1, T=16, batch_size=100, n_enc=128, n_z=32, n_dec=128, read_n=12, write_n=12)[source]

Summary

cadl.draw.decoder(z, rnn, batch_size, state=None, n_dec=64, reuse=None)[source]

Summary

Parameters:
  • z (TYPE) – Description
  • rnn (TYPE) – Description
  • batch_size (TYPE) – Description
  • state (None, optional) – Description
  • n_dec (int, optional) – Description
  • reuse (None, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.draw.encoder(x, rnn, batch_size, state=None, n_enc=64, reuse=None)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • rnn (TYPE) – Description
  • batch_size (TYPE) – Description
  • state (None, optional) – Description
  • n_enc (int, optional) – Description
  • reuse (None, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.draw.filter_image(x, F_x, F_y, log_gamma, A, B, C, N, inverted=False)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • F_x (TYPE) – Description
  • F_y (TYPE) – Description
  • log_gamma (TYPE) – Description
  • A (TYPE) – Description
  • B (TYPE) – Description
  • C (TYPE) – Description
  • N (TYPE) – Description
  • inverted (bool, optional) – Description
Returns:

  • name (TYPE) – Description
Deleted Parameters:gamma (TYPE) – Description

cadl.draw.linear(x, n_output)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • n_output (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.draw.read(x_t, x_hat_t, h_dec_t, read_n=5, A=28, B=28, C=1, use_attention=True, reuse=None)[source]

Read from the input image, x, and reconstruction error image x_hat.

Optionally apply a filterbank w/ use_attention.

Parameters:
  • x_t (tf.Tensor) – Input image to optionally filter
  • x_hat_t (tf.Tensor) – Reconstruction error to optionally filter
  • h_dec_t (tf.Tensor) – Output of the decoder of the network (could also be the encoder, but the authors suggest using the decoder instead; see end of section 2.1)
  • read_n (int, optional) – Description
  • A (int, optional) – Description
  • B (int, optional) – Description
  • C (int, optional) – Description
  • use_attention (bool, optional) – Description
  • reuse (None, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.draw.test_mnist()[source]
cadl.draw.train_dataset(ds, A, B, C, T=20, n_enc=512, n_z=200, n_dec=512, read_n=12, write_n=12, batch_size=100, n_epochs=100)[source]
cadl.draw.train_input_pipeline(files, A, B, C, T=20, n_enc=512, n_z=256, n_dec=512, read_n=15, write_n=15, batch_size=64, n_epochs=1000000000.0, input_shape=(64, 64, 3))[source]
cadl.draw.variational_layer(h_enc, noise, n_z=2, reuse=None)[source]

Summary

Parameters:
  • h_enc (TYPE) – Description
  • noise (TYPE) – Description
  • n_z (int, optional) – Description
  • reuse (None, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.draw.write(h_dec_t, write_n=5, A=28, B=28, C=1, use_attention=True, reuse=None)[source]

Summary

Parameters:
  • h_dec_t (TYPE) – Description
  • write_n (int, optional) – Description
  • A (int, optional) – Description
  • B (int, optional) – Description
  • C (int, optional) – Description
  • use_attention (bool, optional) – Description
  • reuse (None, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.fastwavenet module

WaveNet Training and Fast WaveNet Decoding.

From the following paper

Ramachandran, P., Le Paine, T., Khorrami, P., Babaeizadeh, M., Chang, S., Zhang, Y., … Huang, T. (2017). Fast Generation For Convolutional Autoregressive Models, 1–5.

cadl.fastwavenet.create_generation_model(n_stages=5, n_layers_per_stage=10, n_hidden=256, batch_size=1, n_skip=128, n_quantization=256, filter_length=2, onehot=False)[source]

Summary

Parameters:
  • n_stages (int, optional) – Description
  • n_layers_per_stage (int, optional) – Description
  • n_hidden (int, optional) – Description
  • batch_size (int, optional) – Description
  • n_skip (int, optional) – Description
  • n_quantization (int, optional) – Description
  • filter_length (int, optional) – Description
  • onehot (bool, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.fastwavenet.get_sequence_length(n_stages, n_layers_per_stage)[source]

Summary

Parameters:
  • n_stages (TYPE) – Description
  • n_layers_per_stage (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.fastwavenet.test_librispeech()[source]

Summary

cadl.gan module

Generative Adversarial Network.

cadl.gan.GAN(input_shape, n_latent, n_features, rgb, debug=True)[source]

Summary

Parameters:
  • input_shape (TYPE) – Description
  • n_latent (TYPE) – Description
  • n_features (TYPE) – Description
  • rgb (TYPE) – Description
  • debug (bool, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.gan.decoder(z, dimensions=[], channels=[], filter_sizes=[], convolutional=False, activation=<function relu>, output_activation=<function tanh>, reuse=None)[source]

Decoder network codes input x to layers defined by dimensions.

In contrast with encoder, this requires information on the number of output channels in each layer for convolution. Otherwise, it is mostly the same.

Parameters:
  • z (tf.Tensor) – Input to the decoder network, e.g. tf.Placeholder or tf.Variable
  • dimensions (list, optional) – List of the number of neurons in each layer (convolutional=False) -or- List of the number of filters in each layer (convolutional=True), e.g. [100, 100, 100, 100] for a 4-layer deep network with 100 in each layer.
  • channels (list, optional) – For decoding when convolutional=True, require the number of output channels in each layer.
  • filter_sizes (list, optional) – List of the size of the kernel in each layer, e.g.: [3, 3, 3, 3] is a 4-layer deep network w/ 3 x 3 kernels in every layer.
  • convolutional (bool, optional) – Whether or not to use convolutional layers.
  • activation (fn, optional) – Function for applying an activation, e.g. tf.nn.relu
  • output_activation (fn, optional) – Function for applying an activation on the last layer, e.g. tf.nn.relu
  • reuse (bool, optional) – For each layer’s variable scope, whether to reuse existing variables.
Returns:

h – Output tensor of the decoder

Return type:

tf.Tensor

cadl.gan.discriminator(x, convolutional=True, n_features=32, rgb=False, reuse=False)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • convolutional (bool, optional) – Description
  • n_features (int, optional) – Description
  • rgb (bool, optional) – Description
  • reuse (bool, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.gan.encoder(x, dimensions=[], filter_sizes=[], convolutional=False, activation=<function relu>, output_activation=<function sigmoid>, reuse=False)[source]

Encoder network codes input x to layers defined by dimensions.

Parameters:
  • x (tf.Tensor) – Input to the encoder network, e.g. tf.Placeholder or tf.Variable
  • dimensions (list, optional) – List of the number of neurons in each layer (convolutional=False) -or- List of the number of filters in each layer (convolutional=True), e.g. [100, 100, 100, 100] for a 4-layer deep network with 100 in each layer.
  • filter_sizes (list, optional) – List of the size of the kernel in each layer, e.g.: [3, 3, 3, 3] is a 4-layer deep network w/ 3 x 3 kernels in every layer.
  • convolutional (bool, optional) – Whether or not to use convolutional layers.
  • activation (fn, optional) – Function for applying an activation, e.g. tf.nn.relu
  • output_activation (fn, optional) – Function for applying an activation on the last layer, e.g. tf.nn.relu
  • reuse (bool, optional) – For each layer’s variable scope, whether to reuse existing variables.
Returns:

h – Output tensor of the encoder

Return type:

tf.Tensor

cadl.gan.generator(z, output_h, output_w, convolutional=True, n_features=32, rgb=False, reuse=None)[source]

Simple interface to build a decoder network given the input parameters.

Parameters:
  • z (tf.Tensor) – Input to the generator, i.e. tf.Placeholder of tf.Variable
  • output_h (int) – Final generated height
  • output_w (int) – Final generated width
  • convolutional (bool, optional) – Whether or not to build a convolutional generative network.
  • n_features (int, optional) – Number of channels to use in the last hidden layer.
  • rgb (bool, optional) – Whether or not the final generated image is RGB or not.
  • reuse (None, optional) – Whether or not to reuse the variables if they are already created.
Returns:

x_tilde – Output of the generator network.

Return type:

tf.Tensor

cadl.gan.train_input_pipeline(files, init_lr_g=0.0001, init_lr_d=0.0001, n_features=10, n_latent=100, n_epochs=1000000, batch_size=200, n_samples=15, input_shape=[218, 178, 3], crop_shape=[64, 64, 3], crop_factor=0.8)[source]

Summary

Parameters:
  • files (TYPE) – Description
  • init_lr_g (float, optional) – Description
  • init_lr_d (float, optional) – Description
  • n_features (int, optional) – Description
  • n_latent (int, optional) – Description
  • n_epochs (int, optional) – Description
  • batch_size (int, optional) – Description
  • n_samples (int, optional) – Description
  • input_shape (list, optional) – Description
  • crop_shape (list, optional) – Description
  • crop_factor (float, optional) – Description
No Longer Returned:name (TYPE) – Description

cadl.gif module

Utility for creating a GIF.

cadl.gif.build_gif(imgs, interval=0.1, dpi=72, save_gif=True, saveto='animation.gif', show_gif=False, cmap=None)[source]

Take an array or list of images and create a GIF.

Parameters:
  • imgs (np.ndarray or list) – List of images to create a GIF of
  • interval (float, optional) – Spacing in seconds between successive images.
  • dpi (int, optional) – Dots per inch.
  • save_gif (bool, optional) – Whether or not to save the GIF.
  • saveto (str, optional) – Filename of GIF to save.
  • show_gif (bool, optional) – Whether or not to render the GIF using plt.
  • cmap (None, optional) – Optional colormap to apply to the images.
Returns:

ani – The artist animation from matplotlib. Likely not useful.

Return type:

matplotlib.animation.ArtistAnimation
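
A minimal usage sketch; the random frames are illustrative only:

    import numpy as np
    from cadl.gif import build_gif

    imgs = [np.random.rand(64, 64, 3) for _ in range(10)]
    build_gif(imgs, interval=0.2, saveto='animation.gif', show_gif=False)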

cadl.glove module

Global Vector Embeddings.

cadl.glove.course_example()[source]

Summary

cadl.glove.get_model()[source]

Summary

Returns:Description
Return type:TYPE

cadl.i2v module

Illustration2Vec model and preprocessing.

cadl.i2v.deprocess(img)[source]

Summary

Parameters:img (TYPE) – Description
Returns:Description
Return type:TYPE
cadl.i2v.get_i2v_model()[source]

Get a pretrained i2v network.

Returns:net – {‘graph_def’: graph_def, ‘labels’: synsets} where the graph_def is a tf.GraphDef and the synsets map an integer label from 0-1000 to a list of names
Return type:dict
cadl.i2v.get_i2v_tag_model()[source]

Get a pretrained i2v tag network.

Returns:net – {‘graph_def’: graph_def, ‘labels’: synsets} where the graph_def is a tf.GraphDef and the synsets map an integer label from 0-1000 to a list of names
Return type:dict
cadl.i2v.i2v_download()[source]

Download a pretrained i2v network.

Returns:Description
Return type:TYPE
cadl.i2v.i2v_tag_download()[source]

Download a pretrained i2v network.

Returns:Description
Return type:TYPE
cadl.i2v.preprocess(img, crop=True, resize=True, dsize=(224, 224))[source]

Summary

Parameters:
  • img (TYPE) – Description
  • crop (bool, optional) – Description
  • resize (bool, optional) – Description
  • dsize (tuple, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.i2v.test_i2v()[source]

Loads the i2v network and applies it to a test image.

cadl.inception module

Inception model, download, and preprocessing.

cadl.inception.deprocess(img)[source]

Summary

Parameters:img (TYPE) – Description
Returns:Description
Return type:TYPE
cadl.inception.get_inception_model(data_dir='inception', version='v5')[source]

Get a pretrained inception network.

Parameters:
  • data_dir (str, optional) – Location of the pretrained inception network download.
  • version (str, optional) – Version of the model: [‘v3’] or ‘v5’.
Returns:

net – {‘graph_def’: graph_def, ‘labels’: synsets} where the graph_def is a tf.GraphDef and the synsets map an integer label from 0-1000 to a list of names

Return type:

dict

cadl.inception.inception_download(data_dir='inception', version='v5')[source]

Download a pretrained inception network.

Parameters:
  • data_dir (str, optional) – Location of the pretrained inception network download.
  • version (str, optional) – Version of the model: [‘v3’] or ‘v5’.
Returns:

Description

Return type:

TYPE

cadl.inception.preprocess(img, crop=True, resize=True, dsize=(299, 299))[source]

Summary

Parameters:
  • img (TYPE) – Description
  • crop (bool, optional) – Description
  • resize (bool, optional) – Description
  • dsize (tuple, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.inception.test_inception()[source]

Loads the inception network and applies it to a test image.

cadl.librispeech module

LibriSpeech dataset, batch processing, and preprocessing.

cadl.librispeech.batch_generator(dataset, batch_size=32, max_sequence_length=6144, maxval=32768.0, threshold=0.2, normalize=True)[source]

Summary

Parameters:
  • dataset (TYPE) – Description
  • batch_size (int, optional) – Description
  • max_sequence_length (int, optional) – Description
  • maxval (float, optional) – Description
  • threshold (float, optional) – Description
  • normalize (bool, optional) – Description
Yields:

TYPE – Description

cadl.librispeech.get_dataset(saveto='librispeech', convert_to_wav=False, kind='dev')[source]

Download the LibriSpeech dataset and convert to wav files.

More info: http://www.openslr.org/12/

This interface downloads the LibriSpeech dataset and attempts to convert the FLAC files to WAV using ffmpeg. If you do not have ffmpeg installed, this function will not be able to convert the files.

Parameters:
  • saveto (str) – Directory to save the resulting dataset [‘librispeech’]
  • convert_to_wav (bool, optional) – Description
  • kind (str, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.magenta_utils module

cadl.nb_utils module

Utility for displaying Tensorflow graphs.

cadl.nb_utils.show_graph(graph_def)[source]

Summary

Parameters:graph_def (TYPE) – Description

cadl.nsynth module

NSynth: WaveNet Autoencoder.

class cadl.nsynth.Config(encoding, train_path=None)[source]

Bases: object

Configuration object that helps manage the graph.

ae_bottleneck_width

int – Description

ae_hop_length

int – Description

encoding

TYPE – Description

learning_rate_schedule

TYPE – Description

num_iters

int – Description

train_path

TYPE – Description

build(inputs, is_training)[source]

Build the graph for this configuration.

Parameters:
  • inputs – A dict of inputs. For training, should contain ‘wav’.
  • is_training – Whether we are training or not. Not used in this config.
Returns:

A dict of outputs that includes the ‘predictions’, ‘loss’, the ‘encoding’, the ‘quantized_input’, and whatever metrics we want to track for eval.

get_batch(batch_size)[source]

Summary

Parameters:batch_size (TYPE) – Description
Returns:Description
Return type:TYPE
class cadl.nsynth.FastGenerationConfig[source]

Bases: object

Configuration object that helps manage the graph.

build(inputs)[source]

Build the graph for this configuration.

Parameters:inputs – A dict of inputs. For training, should contain ‘wav’.
Returns:A dict of outputs that includes the ‘predictions’, ‘loss’, the ‘encoding’, the ‘quantized_input’, and whatever metrics we want to track for eval.
Deleted Parameters:is_training – Whether we are training or not. Not used in this config.
cadl.nsynth.causal_linear(x, n_inputs, n_outputs, name, filter_length, rate, batch_size)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • n_inputs (TYPE) – Description
  • n_outputs (TYPE) – Description
  • name (TYPE) – Description
  • filter_length (TYPE) – Description
  • rate (TYPE) – Description
  • batch_size (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.nsynth.get_model()[source]

Summary

cadl.nsynth.inv_mu_law(x, mu=255.0)[source]

A TF implementation of inverse Mu-Law.

Parameters:
  • x – The Mu-Law samples to decode.
  • mu – The Mu we used to encode these samples.
Returns:

The decoded data.

Return type:

out
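
An independent numpy illustration of the textbook inverse Mu-Law formula for samples already scaled to [-1, 1]; the TF implementation above may additionally rescale its integer-coded input before applying it:

    import numpy as np

    def inv_mu_law_np(y, mu=255.0):
        return np.sign(y) * (1.0 / mu) * ((1.0 + mu) ** np.abs(y) - 1.0)

    y = np.linspace(-1, 1, 5)
    print(inv_mu_law_np(y))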

cadl.nsynth.linear(x, n_inputs, n_outputs, name)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • n_inputs (TYPE) – Description
  • n_outputs (TYPE) – Description
  • name (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.nsynth.load_audio(wav_file, sample_length=64000)[source]

Summary

Parameters:
  • wav_file (TYPE) – Description
  • sample_length (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.nsynth.load_fastgen_nsynth(batch_size=1, sample_length=64000)[source]

Summary

Parameters:
  • batch_size (int, optional) – Description
  • sample_length (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.nsynth.load_nsynth(encoding=True, batch_size=1, sample_length=64000)[source]

Summary

Parameters:
  • encoding (bool, optional) – Description
  • batch_size (int, optional) – Description
  • sample_length (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.nsynth.synthesize(wav_file, out_file='synthesis.wav', sample_length=64000, synth_length=16000, ckpt_path='./model.ckpt-200000', resample_encoding=False)[source]

Summary

Parameters:
  • wav_file (TYPE) – Description
  • out_file (str, optional) – Description
  • sample_length (int, optional) – Description
  • synth_length (int, optional) – Description
  • ckpt_path (str, optional) – Description
  • resample_encoding (bool, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.pixelcnn module

Conditional Gated Pixel CNN.

cadl.pixelcnn.build_conditional_pixel_cnn_model(B=None, H=32, W=32, C=3, n_conditionals=None)[source]

Conditional Gated Pixel CNN Model.

van den Oord, A., Kalchbrenner, N., Vinyals, O., Espeholt, L., Graves, A., & Kavukcuoglu, K. (2016). Conditional Image Generation with PixelCNN Decoders.

Implements most of the paper, except for the autoencoder, the triplet loss on face embeddings, and the pad/crop/shift ops for convolution (omitted as they are less clear, in my opinion, from a pedagogical point of view).

Parameters:
  • B (None, optional) – Description
  • H (int, optional) – Description
  • W (int, optional) – Description
  • C (int, optional) – Description
  • n_conditionals (None, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.pixelcnn.gated_conv2d(X, K_h, K_w, K_c, strides=[1, 1, 1, 1], padding='SAME', mask=None, cond_h=None, vertical_h=None)[source]

Summary

Parameters:
  • X (TYPE) – Description
  • K_h (TYPE) – Description
  • K_w (TYPE) – Description
  • K_c (TYPE) – Description
  • strides (list, optional) – Description
  • padding (str, optional) – Description
  • mask (None, optional) – Description
  • cond_h (None, optional) – Description
  • vertical_h (None, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.pixelcnn.generate()[source]

Summary

cadl.pixelcnn.train_tiny_imagenet(ckpt_path='pixelcnn', n_epochs=1000, save_step=100, write_step=25, B=32, H=64, W=64, C=3)[source]

Summary

Parameters:
  • ckpt_path (str, optional) – Description
  • n_epochs (int, optional) – Description
  • save_step (int, optional) – Description
  • write_step (int, optional) – Description
  • B (int, optional) – Description
  • H (int, optional) – Description
  • W (int, optional) – Description
  • C (int, optional) – Description

cadl.pixelrnn module

Basic PixelRNN, i.e. CharRNN style; none of the fancier variants (Row, Diagonal, or BiDiagonal LSTMs).

cadl.pixelrnn.build_pixel_rnn_basic_model(B=50, H=32, W=32, C=32, n_units=100, n_layers=2)[source]

Summary

Parameters:
  • B (int, optional) – Description
  • H (int, optional) – Description
  • W (int, optional) – Description
  • C (int, optional) – Description
  • n_units (int, optional) – Description
  • n_layers (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.pixelrnn.infer(sess, net, H, W, C, pixel_value=128, state=None)[source]

Summary

Parameters:
  • sess (TYPE) – Description
  • net (TYPE) – Description
  • H (TYPE) – Description
  • W (TYPE) – Description
  • C (TYPE) – Description
  • pixel_value (int, optional) – Description
  • state (None, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.pixelrnn.train_tiny_imagenet()[source]

Summary

cadl.seq2seq module

Sequence to Sequence models w/ Attention and BiDirectional Dynamic RNNs.

cadl.seq2seq.batch_generator(sources, targets, source_lengths, target_lengths, batch_size=50)[source]

Summary

Parameters:
  • sources (TYPE) – Description
  • targets (TYPE) – Description
  • source_lengths (TYPE) – Description
  • target_lengths (TYPE) – Description
  • batch_size (int, optional) – Description
Yields:

TYPE – Description

cadl.seq2seq.create_model(source_vocab_size=10000, target_vocab_size=10000, input_embed_size=512, target_embed_size=512, share_input_and_target_embedding=True, n_neurons=512, n_layers=4, use_attention=True, max_sequence_size=30)[source]

Summary

Parameters:
  • source_vocab_size (int, optional) – Description
  • target_vocab_size (int, optional) – Description
  • input_embed_size (int, optional) – Description
  • target_embed_size (int, optional) – Description
  • share_input_and_target_embedding (bool, optional) – Description
  • n_neurons (int, optional) – Description
  • n_layers (int, optional) – Description
  • use_attention (bool, optional) – Description
  • max_sequence_size (int, optional) – Description
Returns:

Description

Return type:

TYPE

Raises:

ValueError – Description

cadl.seq2seq.id2word(ids, vocab)[source]

Summary

Parameters:
  • ids (TYPE) – Description
  • vocab (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.seq2seq.preprocess(text, min_count=5, min_length=3, max_length=30)[source]

Summary

Parameters:
  • text (TYPE) – Description
  • min_count (int, optional) – Description
  • min_length (int, optional) – Description
  • max_length (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.seq2seq.train(text, max_sequence_size=20, use_attention=True, min_count=25, min_length=5, n_epochs=1000, batch_size=100)[source]

Summary

Parameters:
  • text (TYPE) – Description
  • max_sequence_size (int, optional) – Description
  • use_attention (bool, optional) – Description
  • min_count (int, optional) – Description
  • min_length (int, optional) – Description
  • n_epochs (int, optional) – Description
  • batch_size (int, optional) – Description
cadl.seq2seq.train_cornell(**kwargs)[source]

Summary

Parameters:**kwargs – Description
Returns:Description
Return type:TYPE
cadl.seq2seq.word2id(words, vocab)[source]

Summary

Parameters:
  • words (TYPE) – Description
  • vocab (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.squeezenet module

SqueezeNet

cadl.squeezenet.fire_module(input, fire_id, channel, s1, e1, e3)[source]
Basic module that makes up the SqueezeNet architecture. It has two layers:
  1. Squeeze layer (1x1 convolutions)
  2. Expand layer (1x1 and 3x3 convolutions)
Parameters:
  • input (TYPE) – TensorFlow tensor
  • fire_id (TYPE) – Variable scope name
  • channel (TYPE) – Depth of the previous output
  • s1 (TYPE) – Number of filters for squeeze 1x1 layer
  • e1 (TYPE) – Number of filters for expand 1x1 layer
  • e3 (TYPE) – Number of filters for expand 3x3 layer
Returns:

TensorFlow tensor

Return type:

TYPE
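
A hedged TensorFlow 1.x sketch of the fire-module idea described above (a 1x1 squeeze convolution followed by parallel 1x1 and 3x3 expand convolutions concatenated along the channel axis); this is an illustration, not the cadl implementation:

    import tensorflow as tf

    def fire_module_sketch(x, s1, e1, e3, scope='fire'):
        with tf.variable_scope(scope):
            squeeze = tf.layers.conv2d(x, s1, 1, activation=tf.nn.relu, name='squeeze_1x1')
            expand1 = tf.layers.conv2d(squeeze, e1, 1, activation=tf.nn.relu, name='expand_1x1')
            expand3 = tf.layers.conv2d(squeeze, e3, 3, padding='same',
                                       activation=tf.nn.relu, name='expand_3x3')
            return tf.concat([expand1, expand3], axis=-1)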

cadl.squeezenet.squeeze_net(input, classes)[source]

SqueezeNet model written in TensorFlow. It provides AlexNet-level accuracy with 50x fewer parameters and a smaller model size.

Parameters:
  • input (TYPE) – Input tensor (4D)
  • classes (TYPE) – Number of classes for classification
Returns:

TensorFlow tensor

Return type:

TYPE

cadl.stats module

cadl.stylenet module

Style Net w/ tests for Video Style Net.

cadl.stylenet.make_4d(img)[source]

Create a 4-dimensional N x H x W x C image.

Parameters:img (np.ndarray) – Given image as H x W x C or H x W.
Returns:img – N x H x W x C image.
Return type:np.ndarray
Raises:ValueError – Unexpected number of dimensions.
cadl.stylenet.stylize(content_img, style_img, base_img=None, saveto=None, gif_step=5, n_iterations=100, style_weight=1.0, content_weight=1.0)[source]

Stylization w/ the given content and style images.

Follows the approach in Leon Gatys et al.

Parameters:
  • content_img (np.ndarray) – Image to use for finding the content features.
  • style_img (TYPE) – Image to use for finding the style features.
  • base_img (None, optional) – Image to use for the base content. Can be noise or an existing image. If None, the content image will be used.
  • saveto (str, optional) – Name of GIF image to write to, e.g. “stylization.gif”
  • gif_step (int, optional) – Modulo of iterations to save the current stylization.
  • n_iterations (int, optional) – Number of iterations to run for.
  • style_weight (float, optional) – Weighting on the style features.
  • content_weight (float, optional) – Weighting on the content features.
Returns:

stylization – Final iteration of the stylization.

Return type:

np.ndarray
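
A hedged usage sketch; the image file names are illustrative only:

    import matplotlib.pyplot as plt
    from cadl import stylenet

    content = plt.imread('content.jpg')
    style = plt.imread('style.jpg')
    result = stylenet.stylize(content, style, n_iterations=100,
                              style_weight=5.0, content_weight=1.0,
                              saveto='stylization.gif')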

cadl.stylenet.test()[source]

Test for artistic stylization.

cadl.stylenet.test_video(style_img='arles.jpg', videodir='kurosawa')[source]
cadl.stylenet.warp_img(img, dx, dy)[source]

Apply the motion vectors to the given image.

Parameters:
  • img (np.ndarray) – Input image to apply motion to.
  • dx (np.ndarray) – H x W matrix defining the magnitude of the X vector
  • dy (np.ndarray) – H x W matrix defining the magnitude of the Y vector
Returns:

img – Image with pixels warped according to dx, dy.

Return type:

np.ndarray

cadl.tedlium module

TEDLium Dataset.

cadl.tedlium.get_dataset()[source]

Summary

Returns:Description
Return type:TYPE

cadl.utils module

Various utilities including downloading, common layers, etc.

cadl.utils.bias_variable(shape, **kwargs)[source]

Helper function to create a bias variable initialized with a constant value.

Parameters:
  • shape (list) – Size of weight variable
  • **kwargs – Description
Returns:

Description

Return type:

TYPE

cadl.utils.binary_cross_entropy(z, x, name=None)[source]

Binary Cross Entropy measures cross entropy of a binary variable.

loss(x, z) = - sum_i (x[i] * log(z[i]) + (1 - x[i]) * log(1 - z[i]))

Parameters:
  • z (tf.Tensor) – A Tensor of the same type and shape as x.
  • x (tf.Tensor) – A Tensor of type float32 or float64.
  • name (None, optional) – Description
Returns:

Description

Return type:

TYPE
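
A quick numpy check of the loss formula above, with z the prediction in (0, 1) and x the binary target:

    import numpy as np

    x = np.array([1.0, 0.0, 1.0])
    z = np.array([0.9, 0.2, 0.6])
    loss = -np.sum(x * np.log(z) + (1 - x) * np.log(1 - z))
    print(loss)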

cadl.utils.build_submission(filename, file_list, optional_file_list=())[source]

Helper utility to check homework assignment submissions and package them.

Parameters:
  • filename (str) – Output zip file name
  • file_list (tuple) – Tuple of files to include
  • optional_file_list (tuple, optional) – Description
cadl.utils.conv2d(x, n_output, k_h=5, k_w=5, d_h=2, d_w=2, padding='SAME', name='conv2d', reuse=None)[source]

Helper for creating a 2d convolution operation.

Parameters:
  • x (tf.Tensor) – Input tensor to convolve.
  • n_output (int) – Number of filters.
  • k_h (int, optional) – Kernel height
  • k_w (int, optional) – Kernel width
  • d_h (int, optional) – Height stride
  • d_w (int, optional) – Width stride
  • padding (str, optional) – Padding type: “SAME” or “VALID”
  • name (str, optional) – Variable scope
  • reuse (None, optional) – Description
Returns:

op – Output of convolution

Return type:

tf.Tensor

cadl.utils.convolve(img, kernel)[source]

Use Tensorflow to convolve a 4D image with a 4D kernel.

Parameters:
  • img (np.ndarray) – 4-dimensional image shaped N x H x W x C
  • kernel (np.ndarray) – 4-dimensional kernel shaped K_H x K_W x C_I x C_O, corresponding to the kernel’s height and width, the number of input channels, and the number of output channels. Note that C_I should = C.
Returns:

result – Convolved result.

Return type:

np.ndarray

cadl.utils.corrupt(x)[source]

Take an input tensor and add uniform masking.

Parameters:x (Tensor/Placeholder) – Input to corrupt.
Returns:x_corrupted – 50 pct of values corrupted.
Return type:Tensor
cadl.utils.deconv2d(x, n_output_h, n_output_w, n_output_ch, n_input_ch=None, k_h=5, k_w=5, d_h=2, d_w=2, padding='SAME', name='deconv2d', reuse=None)[source]

Deconvolution helper.

Parameters:
  • x (tf.Tensor) – Input tensor to convolve.
  • n_output_h (int) – Height of output
  • n_output_w (int) – Width of output
  • n_output_ch (int) – Number of filters.
  • n_input_ch (None, optional) – Description
  • k_h (int, optional) – Kernel height
  • k_w (int, optional) – Kernel width
  • d_h (int, optional) – Height stride
  • d_w (int, optional) – Width stride
  • padding (str, optional) – Padding type: “SAME” or “VALID”
  • name (str, optional) – Variable scope
  • reuse (None, optional) – Description
Returns:

op – Output of deconvolution

Return type:

tf.Tensor

cadl.utils.download(path)[source]

Use urllib to download a file.

Parameters:path (str) – Url to download
Returns:path – Location of downloaded file.
Return type:str
cadl.utils.download_and_extract_tar(path, dst)[source]

Download and extract a tar file.

Parameters:
  • path (str) – Url to tar file to download.
  • dst (str) – Location to save tar file contents.
cadl.utils.download_and_extract_zip(path, dst)[source]

Download and extract a zip file.

Parameters:
  • path (str) – Url to zip file to download.
  • dst (str) – Location to save zip file contents.
cadl.utils.exists(site)[source]

Summary

Parameters:site (TYPE) – Description
Returns:Description
Return type:TYPE
cadl.utils.flatten(x, name=None, reuse=None)[source]

Flatten Tensor to 2-dimensions.

Parameters:
  • x (tf.Tensor) – Input tensor to flatten.
  • name (None, optional) – Variable scope for flatten operations
  • reuse (None, optional) – Description
Returns:

flattened – Flattened tensor.

Return type:

tf.Tensor

Raises:

ValueError – Description

cadl.utils.gabor(ksize=32)[source]

Use Tensorflow to compute a 2D Gabor Kernel.

Parameters:ksize (int, optional) – Size of kernel.
Returns:gabor – Gabor kernel with ksize x ksize dimensions.
Return type:np.ndarray
cadl.utils.gauss(mean, stddev, ksize)[source]

Use Tensorflow to compute a Gaussian Kernel.

Parameters:
  • mean (float) – Mean of the Gaussian (e.g. 0.0).
  • stddev (float) – Standard Deviation of the Gaussian (e.g. 1.0).
  • ksize (int) – Size of kernel (e.g. 16).
Returns:

kernel – Computed Gaussian Kernel using Tensorflow.

Return type:

np.ndarray

cadl.utils.gauss2d(mean, stddev, ksize)[source]

Use Tensorflow to compute a 2D Gaussian Kernel.

Parameters:
  • mean (float) – Mean of the Gaussian (e.g. 0.0).
  • stddev (float) – Standard Deviation of the Gaussian (e.g. 1.0).
  • ksize (int) – Size of kernel (e.g. 16).
Returns:

kernel – Computed 2D Gaussian Kernel using Tensorflow.

Return type:

np.ndarray

cadl.utils.get_celeb_files(dst='img_align_celeba', max_images=100)[source]

Download the first 100 images of the celeb dataset.

Files will be placed in a directory ‘img_align_celeba’ if one doesn’t exist.

Returns:

files – Locations to the first 100 images of the celeb net dataset.

Return type:

list of strings

Parameters:
  • dst (str, optional) – Description
  • max_images (int, optional) – Description
cadl.utils.get_celeb_imgs(max_images=100)[source]

Load the first max_images images of the celeb dataset.

Returns:imgs – List of the first 100 images from the celeb dataset
Return type:list of np.ndarray
Parameters:max_images (int, optional) – Description
cadl.utils.imcrop_tosquare(img)[source]

Make any image a square image.

Parameters:img (np.ndarray) – Input image to crop, assumed at least 2d.
Returns:crop – Cropped image.
Return type:np.ndarray
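
A NumPy sketch of a central square crop (one plausible behaviour; how the library handles odd-pixel remainders may differ):

    import numpy as np

    def imcrop_tosquare_np(img):
        # Crop the longer dimension symmetrically so that height == width.
        h, w = img.shape[:2]
        if h > w:
            off = (h - w) // 2
            return img[off:off + w, ...]
        elif w > h:
            off = (w - h) // 2
            return img[:, off:off + h, ...]
        return img
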
cadl.utils.interp(l, r, n_samples)[source]

Interpolate between the arrays l and r, n_samples times.

Parameters:
  • l (np.ndarray) – Left edge
  • r (np.ndarray) – Right edge
  • n_samples (int) – Number of samples
Returns:

arr – Interpolated array

Return type:

np.ndarray
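
A NumPy sketch of the linear interpolation this describes (whether the endpoints are included is an assumption):

    import numpy as np

    def interp_np(l, r, n_samples):
        # Blend from l to r in n_samples evenly spaced steps.
        return np.array([l + (r - l) * t
                         for t in np.linspace(0.0, 1.0, n_samples)])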

cadl.utils.linear(x, n_output, name=None, activation=None, reuse=None)[source]

Fully connected layer.

Parameters:
  • x (tf.Tensor) – Input tensor to connect
  • n_output (int) – Number of output neurons
  • name (None, optional) – Scope to apply
  • activation (None, optional) – Description
  • reuse (None, optional) – Description
Returns:

h, W – Output of fully connected layer and the weight matrix

Return type:

tf.Tensor, tf.Tensor
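
A minimal TF 1.x sketch of what such a layer computes, h = activation(xW + b); the variable names and the initializer used here are assumptions, not the library’s exact choices:

    import tensorflow as tf

    def linear_sketch(x, n_output, name='fc', activation=None, reuse=None):
        n_input = x.get_shape().as_list()[1]
        with tf.variable_scope(name, reuse=reuse):
            W = tf.get_variable('W', [n_input, n_output],
                                initializer=tf.random_normal_initializer(stddev=0.02))
            b = tf.get_variable('b', [n_output],
                                initializer=tf.constant_initializer(0.0))
            h = tf.matmul(x, W) + b
            if activation is not None:
                h = activation(h)
        return h, W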

cadl.utils.load_audio(filename, b_normalize=True)[source]

Load the audio file at the provided filename using scipy.io.wavfile.

Optionally normalizes the audio to the maximum value.

Parameters:
  • filename (str) – File to load.
  • b_normalize (bool, optional) – Normalize to the maximum value.
Returns:

Description

Return type:

TYPE

cadl.utils.lrelu(features, leak=0.2)[source]

Leaky rectifier.

Parameters:
  • features (tf.Tensor) – Input to apply leaky rectifier to.
  • leak (float, optional) – Percentage of leak.
Returns:

op – Resulting output of applying leaky rectifier activation.

Return type:

tf.Tensor
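
The leaky rectifier passes positive values through unchanged and scales negative values by leak, i.e. f(x) = max(x, leak * x). A one-line TF sketch:

    import tensorflow as tf

    def lrelu_sketch(features, leak=0.2):
        # Positive inputs are unchanged; negatives are multiplied by `leak`.
        return tf.maximum(features, leak * features)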

cadl.utils.make_latent_manifold(corners, n_samples)[source]

Create a 2D manifold out of the provided corners, producing n_samples * n_samples samples.

Parameters:
  • corners (list of np.ndarray) – The four corners to interpolate.
  • n_samples (int) – Number of samples to use in interpolation.
Returns:

arr – Stacked array of all 2D interpolated samples

Return type:

np.ndarray
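
One way to build such a manifold is bilinear interpolation between the four corners. The corner ordering assumed below (top-left, top-right, bottom-left, bottom-right) is an assumption and may differ from the library’s:

    import numpy as np

    def make_latent_manifold_np(corners, n_samples):
        tl, tr, bl, br = corners
        samples = []
        for row_t in np.linspace(0.0, 1.0, n_samples):
            # Interpolate down the left and right edges...
            left = tl + (bl - tl) * row_t
            right = tr + (br - tr) * row_t
            for col_t in np.linspace(0.0, 1.0, n_samples):
                # ...then across each row.
                samples.append(left + (right - left) * col_t)
        return np.stack(samples)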

cadl.utils.montage(images, saveto='montage.png')[source]

Draw all images as a montage separated by 1 pixel borders.

Also saves the file to the destination specified by saveto.

Parameters:
  • images (numpy.ndarray) – Input array to create montage of. Array should be: batch x height x width x channels.
  • saveto (str) – Location to save the resulting montage image.
Returns:

m – Montage image.

Return type:

numpy.ndarray

cadl.utils.montage_filters(W)[source]

Draws all filters (n_input * n_output filters) as a montage image separated by 1 pixel borders.

Parameters:W (Tensor) – Input tensor to create montage of.
Returns:m – Montage image.
Return type:numpy.ndarray
cadl.utils.normalize(a, s=0.1)[source]

Normalize the image range for visualization

Parameters:
  • a (TYPE) – Description
  • s (float, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.utils.sample_categorical(pmf)[source]

Sample from a categorical distribution.

Parameters:pmf – Probability mass function. Output of a softmax over categories. Array of shape [batch_size, number of categories]. Rows sum to 1.
Returns:Array of size [batch_size, 1]. Integer of category sampled.
Return type:idxs
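
A NumPy sketch of per-row categorical sampling via the inverse-CDF method (the library’s implementation may differ):

    import numpy as np

    def sample_categorical_np(pmf):
        # pmf: [batch_size, n_categories], each row sums to 1.
        cdf = np.cumsum(pmf, axis=1)
        u = np.random.uniform(size=(pmf.shape[0], 1))
        # First category whose cumulative mass exceeds the uniform draw.
        idxs = (u < cdf).argmax(axis=1)
        return idxs.reshape(-1, 1)
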
cadl.utils.slice_montage(montage, img_h, img_w, n_imgs)[source]

Slice a montage image into n_img h x w images.

Performs the opposite of the montage function. Takes a montage image and slices it back into an N x H x W x C array.

Parameters:
  • montage (np.ndarray) – Montage image to slice.
  • img_h (int) – Height of sliced image
  • img_w (int) – Width of sliced image
  • n_imgs (int) – Number of images to slice
Returns:

sliced – Sliced images as 4d array.

Return type:

np.ndarray
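
A usage sketch of the round trip between montage and slice_montage (assumes the cadl package is importable; whether the recovered batch matches the input exactly depends on how the 1-pixel borders are handled):

    import numpy as np
    from cadl.utils import montage, slice_montage

    # Hypothetical batch of 16 RGB images.
    imgs = np.random.rand(16, 64, 64, 3)
    m = montage(imgs, saveto='montage.png')
    # Slice the montage back into a [16, 64, 64, 3] batch.
    sliced = slice_montage(m, img_h=64, img_w=64, n_imgs=16)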

cadl.utils.stdout_redirect(where)[source]

Summary

Parameters:where (TYPE) – Description
Yields:TYPE – Description
cadl.utils.to_tensor(x)[source]

Convert a 2-D Tensor to a 4-D Tensor ready for convolution.

Performs the opposite of flatten(x). If the tensor is already 4-D, this returns the same as the input, leaving it unchanged.

Parameters:x (tf.Tensor) – Input 2-D tensor. If 4-D already, left unchanged.
Returns:x – 4-D representation of the input.
Return type:tf.Tensor
Raises:ValueError – If the tensor is not 2D or already 4D.
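
A sketch of one plausible inverse of flatten, assuming the flattened dimension is the square of a single-channel image’s side (this reshape is an assumption; the actual behaviour may differ):

    import numpy as np
    import tensorflow as tf

    def to_tensor_sketch(x):
        n_dims = len(x.get_shape().as_list())
        if n_dims == 4:
            return x
        if n_dims != 2:
            raise ValueError('Expected a 2-D or 4-D tensor.')
        # Assume [N, D] where D = side * side of a square, 1-channel image.
        side = int(np.sqrt(x.get_shape().as_list()[1]))
        return tf.reshape(x, [-1, side, side, 1])
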
cadl.utils.weight_variable(shape, **kwargs)[source]

Helper function to create a weight variable initialized with a normal distribution

Parameters:
  • shape (list) – Size of weight variable
  • **kwargs – Description
Returns:

Description

Return type:

TYPE

cadl.vae module

Convolutional/Variational autoencoder, including demonstration of training such a network on MNIST, CelebNet and the film, “Sita Sings The Blues” using an image pipeline.

cadl.vae.VAE(input_shape=[None, 784], n_filters=[64, 64, 64], filter_sizes=[4, 4, 4], n_hidden=32, n_code=2, activation=<function tanh>, dropout=False, denoising=False, convolutional=False, variational=False)[source]

(Variational) (Convolutional) (Denoising) Autoencoder.

Uses tied weights.

Parameters:
  • input_shape (list, optional) – Shape of the input to the network. e.g. for MNIST: [None, 784].
  • n_filters (list, optional) – Number of filters for each layer. If convolutional=True, this refers to the total number of output filters to create for each layer, with each layer’s number of output filters as a list. If convolutional=False, then this refers to the total number of neurons for each layer in a fully connected network.
  • filter_sizes (list, optional) – Only applied when convolutional=True. This refers to the ksize (height and width) of each convolutional layer.
  • n_hidden (int, optional) – Only applied when variational=True. This refers to the first fully connected layer prior to the variational embedding, directly after the encoding. After the variational embedding, another fully connected layer is created with the same size prior to decoding. Set to 0 to not use an additional hidden layer.
  • n_code (int, optional) – Only applied when variational=True. This refers to the number of latent Gaussians to sample for creating the inner most encoding.
  • activation (function, optional) – Activation function to apply to each layer, e.g. tf.nn.relu
  • dropout (bool, optional) – Whether or not to apply dropout. If using dropout, you must feed a value for ‘keep_prob’, as returned in the dictionary. 1.0 means no dropout is used. 0.0 means every connection is dropped. Sensible values are between 0.5-0.8.
  • denoising (bool, optional) – Whether or not to apply denoising. If using denoising, you must feed a value for ‘corrupt_prob’, as returned in the dictionary. 1.0 means no corruption is used. 0.0 means every feature is corrupted. Sensible values are between 0.5-0.8.
  • convolutional (bool, optional) – Whether or not to use a convolutional network; otherwise a fully connected network will be created. This affects the meaning of the n_filters parameter.
  • variational (bool, optional) – Whether or not to create a variational embedding layer. If n_hidden is greater than 0, a fully connected layer is created after the encoding; then a multivariate Gaussian sampling layer is created, followed by another fully connected layer. The size of the fully connected layers is determined by n_hidden, and the size of the sampling layer is determined by n_code.
Returns:

model

{

‘cost’: Tensor to optimize.
‘Ws’: All weights of the encoder.
‘x’: Input Placeholder.
‘z’: Innermost encoding Tensor (latent features).
‘y’: Reconstruction of the Decoder.
‘keep_prob’: Amount to keep when using Dropout.
‘corrupt_prob’: Amount to corrupt when using Denoising.
‘train’: Set to True when training / applies to Batch Normalization.

}

Return type:

dict
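
A training-loop sketch using the returned dictionary keys described above (TF 1.x; the random batch stands in for a real data pipeline, and the exact placeholder shapes for keep_prob and train are assumptions):

    import numpy as np
    import tensorflow as tf
    from cadl.vae import VAE

    model = VAE(input_shape=[None, 784], convolutional=False,
                variational=True, n_hidden=32, n_code=2, dropout=True)
    optimizer = tf.train.AdamOptimizer(1e-3).minimize(model['cost'])

    batch_x = np.random.rand(64, 784).astype(np.float32)  # stand-in data
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(optimizer, feed_dict={
            model['x']: batch_x,
            model['train']: True,     # batch normalization in training mode
            model['keep_prob']: 0.8,  # dropout keep probability
        })

If denoising=True were used, a value for model[‘corrupt_prob’] would also need to be fed, as noted above.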

cadl.vae.test_celeb()[source]

Train an autoencoder on Celeb Net.

cadl.vae.test_mnist()[source]

Train an autoencoder on MNIST.

This function will train an autoencoder on MNIST and also save many image files during the training process, demonstrating the latent space of the innermost dimension of the encoder, as well as reconstructions of the decoder.

cadl.vae.test_sita()[source]

Train an autoencoder on Sita Sings The Blues.

cadl.vae.train_vae(files, input_shape, learning_rate=0.0001, batch_size=100, n_epochs=50, n_examples=10, crop_shape=[64, 64, 3], crop_factor=0.8, n_filters=[100, 100, 100, 100], n_hidden=256, n_code=50, convolutional=True, variational=True, filter_sizes=[3, 3, 3, 3], dropout=True, keep_prob=0.8, activation=<function relu>, img_step=100, save_step=100, ckpt_name='vae.ckpt')[source]

General purpose training of a (Variational) (Convolutional) Autoencoder.

Supply a list of file paths to images, and this will do everything else.

Parameters:
  • files (list of strings) – List of paths to images.
  • input_shape (list) – Must define what the input image’s shape is.
  • learning_rate (float, optional) – Learning rate.
  • batch_size (int, optional) – Batch size.
  • n_epochs (int, optional) – Number of epochs.
  • n_examples (int, optional) – Number of examples to use while demonstrating the current training iteration’s reconstruction. Creates a square montage, so make sure int(sqrt(n_examples))**2 = n_examples, e.g. 16, 25, 36, ... 100.
  • crop_shape (list, optional) – Size to centrally crop the image to.
  • crop_factor (float, optional) – Resize factor to apply before cropping.
  • n_filters (list, optional) – Same as VAE’s n_filters.
  • n_hidden (int, optional) – Same as VAE’s n_hidden.
  • n_code (int, optional) – Same as VAE’s n_code.
  • convolutional (bool, optional) – Use convolution or not.
  • variational (bool, optional) – Use variational layer or not.
  • filter_sizes (list, optional) – Same as VAE’s filter_sizes.
  • dropout (bool, optional) – Use dropout or not
  • keep_prob (float, optional) – Percent of keep for dropout.
  • activation (function, optional) – Which activation function to use.
  • img_step (int, optional) – How often to save training images showing the manifold and reconstruction.
  • save_step (int, optional) – How often to save checkpoints.
  • ckpt_name (str, optional) – Checkpoints will be named as this, e.g. ‘model.ckpt’
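
A usage sketch with hypothetical file paths and CelebA-style image dimensions (assumes the cadl package is importable):

    from glob import glob
    from cadl.vae import train_vae

    files = glob('img_align_celeba/*.jpg')  # hypothetical image directory
    train_vae(files,
              input_shape=[218, 178, 3],
              crop_shape=[64, 64, 3],
              crop_factor=0.8,
              batch_size=100,
              n_epochs=50,
              ckpt_name='vae.ckpt')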

cadl.vaegan module

Convolutional/Variational autoencoder, including demonstration of training such a network on MNIST, CelebNet and the film, “Sita Sings The Blues” using an image pipeline.

cadl.vaegan.VAE(input_shape=[None, 784], n_filters=[64, 64, 64], filter_sizes=[4, 4, 4], n_hidden=32, n_code=2, activation=<function tanh>, convolutional=False, variational=False)[source]

Summary

Parameters:
  • input_shape (list, optional) – Description
  • n_filters (list, optional) – Description
  • filter_sizes (list, optional) – Description
  • n_hidden (int, optional) – Description
  • n_code (int, optional) – Description
  • activation (TYPE, optional) – Description
  • convolutional (bool, optional) – Description
  • variational (bool, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.vaegan.VAEGAN(input_shape=[None, 784], n_filters=[64, 64, 64], filter_sizes=[4, 4, 4], n_hidden=32, n_code=2, activation=<function tanh>, convolutional=False, variational=False)[source]

Summary

Parameters:
  • input_shape (list, optional) – Description
  • n_filters (list, optional) – Description
  • filter_sizes (list, optional) – Description
  • n_hidden (int, optional) – Description
  • n_code (int, optional) – Description
  • activation (TYPE, optional) – Description
  • convolutional (bool, optional) – Description
  • variational (bool, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.vaegan.decoder(z, shapes, n_hidden=None, dimensions=[], filter_sizes=[], convolutional=False, activation=<function relu>, output_activation=<function relu>)[source]

Summary

Parameters:
  • z (TYPE) – Description
  • shapes (TYPE) – Description
  • n_hidden (None, optional) – Description
  • dimensions (list, optional) – Description
  • filter_sizes (list, optional) – Description
  • convolutional (bool, optional) – Description
  • activation (TYPE, optional) – Description
  • output_activation (TYPE, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.vaegan.discriminator(x, convolutional=True, filter_sizes=[5, 5, 5, 5], activation=<function relu>, n_filters=[100, 100, 100, 100])[source]

Summary

Parameters:
  • x (TYPE) – Description
  • convolutional (bool, optional) – Description
  • filter_sizes (list, optional) – Description
  • activation (TYPE, optional) – Description
  • n_filters (list, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.vaegan.encoder(x, n_hidden=None, dimensions=[], filter_sizes=[], convolutional=False, activation=<function relu>, output_activation=<function sigmoid>)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • n_hidden (None, optional) – Description
  • dimensions (list, optional) – Description
  • filter_sizes (list, optional) – Description
  • convolutional (bool, optional) – Description
  • activation (TYPE, optional) – Description
  • output_activation (TYPE, optional) – Description
Returns:

name – Description

Return type:

TYPE

cadl.vaegan.test_celeb(n_epochs=100, filter_sizes=[3, 3, 3, 3], n_filters=[100, 100, 100, 100], crop_shape=[100, 100, 3])[source]

Summary

Parameters:
  • n_epochs (int, optional) – Description
cadl.vaegan.test_sita(n_epochs=100)[source]

Summary

Parameters:
  • n_epochs (int, optional) – Description
cadl.vaegan.train_vaegan(files, learning_rate=1e-05, batch_size=64, n_epochs=250, n_examples=10, input_shape=[218, 178, 3], crop_shape=[64, 64, 3], crop_factor=0.8, n_filters=[100, 100, 100, 100], n_hidden=None, n_code=128, convolutional=True, variational=True, filter_sizes=[3, 3, 3, 3], activation=<function elu>, ckpt_name='vaegan.ckpt')[source]

Summary

Parameters:
  • files (TYPE) – Description
  • learning_rate (float, optional) – Description
  • batch_size (int, optional) – Description
  • n_epochs (int, optional) – Description
  • n_examples (int, optional) – Description
  • input_shape (list, optional) – Description
  • crop_shape (list, optional) – Description
  • crop_factor (float, optional) – Description
  • n_filters (list, optional) – Description
  • n_hidden (int, optional) – Description
  • n_code (int, optional) – Description
  • convolutional (bool, optional) – Description
  • variational (bool, optional) – Description
  • filter_sizes (list, optional) – Description
  • activation (TYPE, optional) – Description
  • ckpt_name (str, optional) – Description
cadl.vaegan.variational_bayes(h, n_code)[source]

Summary

Parameters:
  • h (TYPE) – Description
  • n_code (TYPE) – Description
Returns:

name – Description

Return type:

TYPE

cadl.vctk module

VCTK Dataset download and preprocessing.

cadl.vctk.batch_generator(dataset, batch_size=32, max_sequence_length=6144, maxval=32768.0, threshold=0.2, normalize=True)[source]

Summary

Parameters:
  • dataset (TYPE) – Description
  • batch_size (int, optional) – Description
  • max_sequence_length (int, optional) – Description
  • maxval (float, optional) – Description
  • threshold (float, optional) – Description
  • normalize (bool, optional) – Description
Yields:

TYPE – Description

cadl.vctk.get_dataset(saveto='vctk', convert_to_16khz=False)[source]

Download the VCTK dataset and convert to wav files.

More info:
http://homepages.inf.ed.ac.uk/jyamagis/page3/page58/page58.html

This interface downloads the VCTK dataset and attempts to convert the FLAC recordings to WAV files using ffmpeg. If you do not have ffmpeg installed, this function will not be able to perform the conversion.

Parameters:
  • saveto (str) – Directory to save the resulting dataset [‘vctk’]
  • convert_to_16khz (bool, optional) – Description
Returns:

Description

Return type:

TYPE
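
A usage sketch (assumes the cadl package is importable and ffmpeg is on the PATH, as noted above):

    from cadl.vctk import get_dataset

    # Downloads VCTK into ./vctk and converts the recordings with ffmpeg.
    dataset = get_dataset(saveto='vctk', convert_to_16khz=True)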

cadl.vgg16 module

VGG16 pretrained model and VGG Face model.

cadl.vgg16.deprocess(img)[source]

Summary

Parameters:img (TYPE) – Description
Returns:Description
Return type:TYPE
cadl.vgg16.get_vgg_face_model()[source]

Summary

Returns:Description
Return type:TYPE
cadl.vgg16.get_vgg_model()[source]

Summary

Returns:Description
Return type:TYPE
cadl.vgg16.preprocess(img, crop=True, resize=True, dsize=(224, 224))[source]

Summary

Parameters:
  • img (TYPE) – Description
  • crop (bool, optional) – Description
  • resize (bool, optional) – Description
  • dsize (tuple, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.vgg16.test_vgg()[source]

Loads the VGG network and applies it to a test image.

cadl.vgg16.test_vgg_face()[source]

Loads the VGG network and applies it to a test image.

cadl.wavenet module

WaveNet Autoencoder and conditional WaveNet.

cadl.wavenet.condition(x, encoding)[source]

Summary

Parameters:
  • x (TYPE) – Description
  • encoding (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.wavenet.create_wavenet(n_stages=10, n_layers_per_stage=9, n_hidden=200, batch_size=32, n_skip=100, filter_length=2, shift=True, n_quantization=256, sample_rate=16000)[source]

Summary

Parameters:
  • n_stages (int, optional) – Description
  • n_layers_per_stage (int, optional) – Description
  • n_hidden (int, optional) – Description
  • batch_size (int, optional) – Description
  • n_skip (int, optional) – Description
  • filter_length (int, optional) – Description
  • shift (bool, optional) – Description
  • n_quantization (int, optional) – Description
  • sample_rate (int, optional) – Description
Returns:

Description

Return type:

TYPE

cadl.wavenet.create_wavenet_autoencoder(n_stages, n_layers_per_stage, n_hidden, batch_size, n_skip, filter_length, bottleneck_width, hop_length, n_quantization, sample_rate)[source]

Summary

Parameters:
  • n_stages (TYPE) – Description
  • n_layers_per_stage (TYPE) – Description
  • n_hidden (TYPE) – Description
  • batch_size (TYPE) – Description
  • n_skip (TYPE) – Description
  • filter_length (TYPE) – Description
  • bottleneck_width (TYPE) – Description
  • hop_length (TYPE) – Description
  • n_quantization (TYPE) – Description
  • sample_rate (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.wavenet.get_sequence_length(n_stages, n_layers_per_stage)[source]

Summary

Parameters:
  • n_stages (TYPE) – Description
  • n_layers_per_stage (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.wavenet.test_librispeech()[source]

Summary

cadl.wavenet.train_vctk()[source]

Summary

Returns:Description
Return type:TYPE

cadl.wavenet_utils module

Various utilities for training WaveNet.

cadl.wavenet_utils.batch_to_time(X, block_size)[source]

Inverse of time_to_batch(X, block_size).

Parameters:
  • X – Tensor of shape [nb*block_size, k, n] for some natural number k.
  • block_size – number of time steps (i.e. size of dimension 1) in the output tensor.
Returns:

Return type:

Tensor of shape [nb, k*block_size, n]

cadl.wavenet_utils.causal_linear(X, n_inputs, n_outputs, name, filter_length, rate, batch_size, depth=1)[source]

Applies dilated convolution using queues. Assumes a filter_length of 2 or 3.

Parameters:
  • X – The [mb, time, channels] tensor input.
  • n_inputs – The input number of channels.
  • n_outputs – The output number of channels.
  • name – The variable scope to provide to W and biases.
  • filter_length – The length of the convolution, assumed to be 3.
  • rate – The rate or dilation
  • batch_size – Non-symbolic value for batch_size.
  • depth (int, optional) – Description
Returns:

  • y – The output of the operation
  • (init_1, init_2) – Initialization operations for the queues
  • (push_1, push_2) – Push operations for the queues

cadl.wavenet_utils.conv1d(X, num_filters, filter_length, name, dilation=1, causal=True, kernel_initializer=<tensorflow.python.ops.init_ops.UniformUnitScaling object>, biases_initializer=<tensorflow.python.ops.init_ops.Constant object>)[source]

Fast 1D convolution that supports causal padding and dilation.

Parameters:
  • X – The [mb, time, channels] float tensor that we convolve.
  • num_filters – The number of filter maps in the convolution.
  • filter_length – The integer length of the filter.
  • name – The name of the scope for the variables.
  • dilation – The amount of dilation.
  • causal – Whether or not this is a causal convolution.
  • kernel_initializer – The kernel initialization function.
  • biases_initializer – The biases initialization function.
Returns:

The output of the 1D convolution.

Return type:

y
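
The key idea behind causal convolution is to pad the time axis on the left only, by dilation * (filter_length - 1) samples, so that an output at time t never depends on inputs later than t. A NumPy illustration of that padding (not the library’s implementation):

    import numpy as np

    def causal_pad(X, filter_length, dilation):
        # X: [mb, time, channels]; pad only the front of the time axis.
        pad = dilation * (filter_length - 1)
        return np.pad(X, [(0, 0), (pad, 0), (0, 0)], mode='constant')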

cadl.wavenet_utils.inv_mu_law(X, mu=255)[source]

A TF implementation of inverse Mu-Law.

Parameters:
  • X – The Mu-Law samples to decode.
  • mu – The Mu we used to encode these samples.
Returns:

The decoded data.

Return type:

out

cadl.wavenet_utils.inv_mu_law_numpy(X, mu=255.0)[source]

A numpy implementation of inverse Mu-Law.

Parameters:
  • X – The Mu-Law samples to decode.
  • mu – The Mu we used to encode these samples.
Returns:

The decoded data.

Return type:

out

cadl.wavenet_utils.linear(X, n_inputs, n_outputs, name)[source]

Summary

Parameters:
  • X (TYPE) – Description
  • n_inputs (TYPE) – Description
  • n_outputs (TYPE) – Description
  • name (TYPE) – Description
Returns:

Description

Return type:

TYPE

cadl.wavenet_utils.mu_law(X, mu=255, int8=False)[source]

A TF implementation of Mu-Law encoding.

Parameters:
  • X – The audio samples to encode.
  • mu – The Mu to use in our Mu-Law.
  • int8 – Use int8 encoding.
Returns:

The Mu-Law encoded int8 data.

Return type:

out

cadl.wavenet_utils.mu_law_numpy(X, mu=255, int8=False)[source]

A NumPy implementation of Mu-Law encoding.

Parameters:
  • X – The audio samples to encode.
  • mu – The Mu to use in our Mu-Law.
  • int8 – Use int8 encoding.
Returns:

The Mu-Law encoded int8 data.

Return type:

out
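
All four Mu-Law helpers implement the standard companding formulas; a NumPy sketch of the encode/decode pair for inputs in [-1, 1] (quantization to int8 is omitted here):

    import numpy as np

    def mu_law_np(x, mu=255.0):
        # Compress: sign(x) * ln(1 + mu*|x|) / ln(1 + mu).
        return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

    def inv_mu_law_np(y, mu=255.0):
        # Expand: sign(y) * ((1 + mu)^|y| - 1) / mu.
        return np.sign(y) * ((1.0 + mu) ** np.abs(y) - 1.0) / mu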

cadl.wavenet_utils.mul_or_none(a, b)[source]

Return the element-wise product of the inputs. If either input is None, return None.

Parameters:
  • a – A tensor input.
  • b – Another tensor input with the same type as a.
Returns:

Return type:

None if either input is None. Otherwise returns a * b.

cadl.wavenet_utils.pool1d(X, window_length, name, mode='avg', stride=None)[source]

1D pooling function that supports multiple different modes.

Parameters:
  • X – The [mb, time, channels] float tensor that we are going to pool over.
  • window_length – The amount of samples we pool over.
  • name – The name of the scope for the variables.
  • mode – The type of pooling, either avg or max.
  • stride – The stride length.
Returns:

The [mb, time // stride, channels] float tensor result of pooling.

Return type:

pooled

cadl.wavenet_utils.shift_right(X)[source]

Shift the input over by one time step and prepend a zero to the front.

Parameters:X – The [mb, time, channels] tensor input.
Returns:The [mb, time, channels] tensor output.
Return type:x_sliced
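
A NumPy sketch of the shift (the library uses TF ops): prepend a zero time step and drop the last one, so each output step depends only on strictly earlier inputs.

    import numpy as np

    def shift_right_np(X):
        # X: [mb, time, channels].
        zeros = np.zeros_like(X[:, :1, :])
        return np.concatenate([zeros, X[:, :-1, :]], axis=1)
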
cadl.wavenet_utils.time_to_batch(X, block_size)[source]

Splits time dimension (i.e. dimension 1) of X into batches. Within each batch element, the k*block_size time steps are transposed, so that the k time steps in each output batch element are offset by block_size from each other. The number of input time steps must be a multiple of block_size.

Parameters:
  • X – Tensor of shape [nb, k*block_size, n] for some natural number k.
  • block_size – number of time steps (i.e. size of dimension 1) in the output tensor.
Returns:

Return type:

Tensor of shape [nb*block_size, k, n]
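
A NumPy sketch of the documented reshape semantics for time_to_batch and its inverse (assumes the number of time steps is already a multiple of block_size; any padding the library performs is not handled here):

    import numpy as np

    def time_to_batch_np(X, block_size):
        nb, t, n = X.shape
        k = t // block_size
        # Split time into k blocks of block_size, then move the within-block
        # offset into the batch dimension:
        # [nb, k*block_size, n] -> [nb*block_size, k, n].
        y = X.reshape(nb, k, block_size, n).transpose(0, 2, 1, 3)
        return y.reshape(nb * block_size, k, n)

    def batch_to_time_np(X, block_size):
        nbb, k, n = X.shape
        nb = nbb // block_size
        # Invert the transform: [nb*block_size, k, n] -> [nb, k*block_size, n].
        y = X.reshape(nb, block_size, k, n).transpose(0, 2, 1, 3)
        return y.reshape(nb, k * block_size, n)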

cadl.word2vec module

Word2Vec model.

cadl.word2vec.build_model(batch_size=128, vocab_size=50000, embedding_size=128, n_neg_samples=64)[source]

Summary

Parameters:
  • batch_size (int, optional) – Description
  • vocab_size (int, optional) – Description
  • embedding_size (int, optional) – Description
  • n_neg_samples (int, optional) – Description
Returns:

Description

Return type:

TYPE

Module contents

Copyright 2017 Parag K. Mital

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.