Generating Sound by a Combination of Markov Chains and N-Grams
This project uses a combination of Markov chains and n-grams to create new .wav files from input .wav files. The program disassembles the input .wav file, turns its list of samples into a list of corresponding n-grams, builds a binary search tree out of those n-grams, and then uses a Markov chain to construct a new sequence of samples, which is written out as a new .wav file.
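The core idea can be sketched in miniature: map every window of n samples to the samples observed to follow it, then repeatedly draw a completion for the most recent window. This is a simplified standalone sketch (the names build_model and generate are mine, not from the worksheet) that uses a dictionary in place of the binary tree, but it has the same key property: every transition it emits was observed somewhere in the input.

```python
import random

def build_model(samples, n=2):
    """Map each window of n samples to the list of samples observed to follow it."""
    model = {}
    for i in range(len(samples) - n):
        # Duplicates in the list act as weights for random.choice below.
        model.setdefault(tuple(samples[i:i + n]), []).append(samples[i + n])
    return model

def generate(model, seed, length):
    """Extend the seed by repeatedly drawing a completion for the last window."""
    out = list(seed)
    while len(out) < length:
        completions = model.get(tuple(out[-len(seed):]))
        if completions is None:
            raise Exception("No completion could be found")
        out.append(random.choice(completions))
    return out

samples = [0, 1, 2, 0, 1, 3, 0, 1, 2]
model = build_model(samples, 2)
chain = generate(model, samples[:2], 8)
```

Storing repeated completions as duplicate list entries makes `random.choice` frequency-weighted for free; the worksheet's Ngram class achieves the same effect explicitly with (count, sample) pairs.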
{{{id=1|
#Andrew Richman
#Math480 - Spring 2011
import random
import numpy as np
import scipy
from scipy.io import wavfile

#The Ngram class represents an n-gram: a list of n elements together with a
#weighted list of the samples observed to follow it
class Ngram:
    def __init__(self, lst):
        self.gram = lst
        self.count = 0
        self.completionList = []

    def __repr__(self):
        return self.gram.__repr__() + ", " + self.count.__repr__() + ", " + self.completionList.__repr__()

    #Adds a completion (a (weight, sample) pair) to the current Ngram
    def add(self, gram):
        notFound = True
        if len(self.completionList) == 0:
            self.completionList += [gram]
            self.count += gram[0]
        else:
            for i in range(len(self.completionList)):
                if self.completionList[i][1] == gram[1]:
                    notFound = False
                    self.completionList[i] = (self.completionList[i][0] + gram[0], gram[1])
                    self.count += gram[0]
                    break
            if notFound:
                self.completionList += [gram]
                self.count += gram[0]

    #Returns a randomly chosen completion for the current Ngram from the
    #weighted completion list
    def getCompletion(self):
        x = random.randint(1, self.count)
        for i in range(len(self.completionList)):
            if x - self.completionList[i][0] <= 0:
                return self.completionList[i][1]
            else:
                x -= self.completionList[i][0]

#The NgramNode class is a node for the binary search tree that stores the Ngrams
class NgramNode:
    def __init__(self, ngnode):
        self.node = ngnode
        self.right = None
        self.left = None

    #Adds a child to the current NgramNode; a duplicate gram is merged into
    #the existing node's completion list
    def addChild(self, child):
        if child.gram > self.node.gram:
            if self.right != None:
                self.right.addChild(child)
            else:
                self.right = NgramNode(child)
        elif child.gram < self.node.gram:
            if self.left != None:
                self.left.addChild(child)
            else:
                self.left = NgramNode(child)
        else:
            self.node.add(child.completionList[0])

#Finds an Ngram in the given tree from a list of n elements
#Raises an exception if no Ngram can be found matching the given list
#(this case will only happen if the final sequence of elements is
#not repeated anywhere else)
def findNgram(ngnode, ngram):
    if ngram == ngnode.node.gram:
        return ngnode.node
    elif (ngnode.node.gram < ngram) and (ngnode.right != None):
        return findNgram(ngnode.right, ngram)
    elif (ngnode.node.gram > ngram) and (ngnode.left != None):
        return findNgram(ngnode.left, ngram)
    else:
        raise Exception("No completion could be found")

#Takes a list of elements and returns a list of Ngrams
#Input is the array and the size of the grams
#If n is less than 2 a ValueError is raised
def processArray(arr, n):
    if n <= 1:
        raise ValueError("gram size must be at least 2")
    ngrams = []
    for i in range(len(arr) - n):
        ng = Ngram(arr[i:n + i])
        ng.completionList += [(1, arr[n + i])]
        ng.count += 1
        ngrams += [ng]
    return ngrams

#Constructs a binary search tree from a list of Ngrams
def buildTree(ngrams):
    root = NgramNode(ngrams[0])
    for i in range(1, len(ngrams)):
        root.addChild(ngrams[i])
    return root

#Uses the Markov chain method to construct a new sequence of elements using
#a binary tree of Ngram nodes, an initial Ngram, and a length limit
#Raises a ValueError if root or seed is None or if length is less than or equal to 0
def buildChain(root, seed, length):
    if root == None or seed == None or length <= 0:
        raise ValueError("root and seed must be non-None and length must be positive")
    markovChain = list(seed.gram)
    curGram = seed
    for i in range(length - len(seed.gram)):
        markovChain += [curGram.getCompletion()]
        curGram = findNgram(root, markovChain[i + 1:])
    return markovChain
///
}}}
The first example is a small clip of classical music. Because the file is a recording of a piano, its samples vary more than those of a synthesized sound, and the composition of the piece is not repetitive. Using n-grams longer than 2 samples causes the new .wav file to be a near replica of the original: there is so much variation that no long patterns are repeated.
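The claim about longer n-grams can be checked directly: if no window of n samples ever recurs with a different continuation, the chain never has a choice to make and can only replay the input. This small standalone sketch (branching_windows is a hypothetical helper, not part of the worksheet) counts the windows that have more than one distinct completion:

```python
def branching_windows(samples, n):
    """Count windows of n samples that are followed by more than one distinct sample."""
    completions = {}
    for i in range(len(samples) - n):
        completions.setdefault(tuple(samples[i:i + n]), set()).add(samples[i + n])
    return sum(1 for s in completions.values() if len(s) > 1)

# Highly varied data: no window ever repeats, so the chain can only replay the input.
varied = list(range(100))
# Repetitive data with two continuations of the window (0, 1): the chain can branch.
looped = [0, 1, 2, 0, 1, 3] * 20

print(branching_windows(varied, 2))  # -> 0
print(branching_windows(looped, 2))  # -> 1
```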
{{{id=2|
exampleOne = wavfile.read(DATA + 'bach22050mono.wav')  #a tuple of sample rate and an array of ints
info = processArray(exampleOne[1].tolist(), 2)  #create the ngrams
tree = buildTree(info)  #build the ngram tree
data = buildChain(tree, info[1], 22050*3)  #create the new sequence using a Markov chain
wavfile.write('bachchain2.wav', exampleOne[0], np.array(data, np.int16))  #create the new .wav file with this sequence
///
}}}
This .wav file uses a synthesized sound that is repeated, meaning that it contains more repeated patterns than the classical music example. The resulting product is less random, and it is more obvious that it came from the original file.
{{{id=3|
exampleTwo = wavfile.read(DATA + 'aphex22050mono.wav')  #a tuple of sample rate and an array of ints
info = processArray(exampleTwo[1].tolist(), 2)  #create the ngrams
tree = buildTree(info)  #build the ngram tree
data = buildChain(tree, info[1], 22050*3)  #create the new sequence using a Markov chain
wavfile.write('aphexchain2.wav', exampleTwo[0], np.array(data, np.int16))  #create the new .wav file with this sequence
///
}}}
This .wav file is a voice recording.
{{{id=7|
exampleThree = wavfile.read(DATA + 'gman22050mono.wav')  #a tuple of sample rate and an array of ints
info = processArray(exampleThree[1].tolist(), 2)  #create the ngrams
tree = buildTree(info)  #build the ngram tree
data = buildChain(tree, info[1], 22050*3)  #create the new sequence using a Markov chain
wavfile.write('gmanchain2.wav', exampleThree[0], np.array(data, np.int16))  #create the new .wav file with this sequence
///
}}}
Using Markov chains to generate text vs .wav files:
The alphabet for each element of a 2-gram chain over a 16-bit .wav file has 65536 possible sample values (2^16), while the alphabet for a 2-gram chain over War and Peace has approximately 28,723 possible words (the number of unique words, found with the Linux command: tr -s '[:blank:]' '\n' < wnptext.txt | tr -d '[:punct:]' | sort | uniq | wc -l).
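For reference, here is a rough Python analogue of that shell pipeline (unique_word_count is a hypothetical helper; like the tr version it is case-sensitive and simply deletes punctuation, so edge cases such as hyphenated words differ):

```python
import string

def unique_word_count(text):
    """Split on whitespace, strip punctuation from each token, count distinct tokens."""
    table = str.maketrans('', '', string.punctuation)
    words = {w.translate(table) for w in text.split()}
    words.discard('')  # tokens that were pure punctuation
    return len(words)

sample = "War and Peace, and war again."
print(unique_word_count(sample))  # -> 5 ("War" and "war" count separately)
```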
The English language has grammatical patterns, while the samples are produced by combinations of functions, some continuous and some not. Because these functions are deterministic, the sample sequences contain less variety, and Markov chain generation therefore produces less interesting results from them. English is less deterministic, and because of this it produces slightly more interesting results from Markov chain generation.