Graduation Year

1507346220

Document Type

Thesis

Degree

M.A.

Degree Name

Master of Arts (M.A.)

Degree Granting Department

Mathematics and Statistics

Major Professor

Natasa Jonoska, Ph.D.

Co-Major Professor

Masahiko Saito, Ph.D.

Committee Member

Dmytro Savchuk, Ph.D.

Keywords

Reduction operations, Double occurrence words, Pattern indices, Nesting index, Ciliates

Abstract

Patterns, sequences of variables, have traditionally only been studied when morphic images of them appear as factors in words. In this thesis, we initiate a study of patterns in words that appear as subwords of words. We say that a pattern appears in a word if each pattern variable can be morphically mapped to a factor in the word. To gain insight into the complexity of, and similarities between, words, we define pattern indices and distances between two words relative to a given set of patterns. The distance is defined as the minimum number of pattern insertions and/or removals that transform one word into another. The pattern index is defined as the minimum number of pattern removals that transform a given word into the empty word. We initially consider pattern distances between arbitrary words. We conjecture that the word distance is computable relative to the pattern αα and prove a lemma in this direction. Motivated by patterns detected in certain scrambled ciliate genomes, we focus on double occurrence words (words where every symbol appears twice) and consider recursive patterns, a generalization of the notion of a pattern which includes new types of words. We show that in double occurrence words the distance relative to so-called complete sets of recursive patterns is computable. In particular, the pattern distance relative to the patterns αα (repeat words) and αα^R (return words) is computable for double occurrence words. We conclude by applying pattern indices and word distances to the analysis of highly scrambled genes in O. trifallax and discover a common pattern.
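The definition of the pattern index for the patterns αα and αα^R can be illustrated with a small brute-force sketch. It is not the thesis's algorithm: it assumes that a pattern instance consists of two non-overlapping occurrences of a non-empty factor u (or of u and its reverse), that a removal deletes both occurrences at once, and that a breadth-first search over all such removals is an acceptable way to find the minimum. The function names `is_double_occurrence`, `removals`, and `pattern_index` are hypothetical and chosen for this illustration only.

```python
from collections import deque


def is_double_occurrence(word):
    """Return True if every symbol of `word` appears exactly twice."""
    return all(word.count(symbol) == 2 for symbol in set(word))


def removals(word):
    """Return all words obtained by deleting one instance of u...u or u...u^R.

    Assumption: u is a non-empty factor and its two occurrences do not overlap.
    """
    n = len(word)
    results = set()
    for i in range(n):
        for length in range(1, (n - i) // 2 + 1):
            u = word[i:i + length]
            targets = {u, u[::-1]}  # repeat word (u) or return word (u reversed)
            for j in range(i + length, n - length + 1):
                if word[j:j + length] in targets:
                    # Delete both occurrences, keep everything else in order.
                    results.add(word[:i] + word[i + length:j] + word[j + length:])
    return results


def pattern_index(word):
    """Minimum number of removals that reduce `word` to the empty word.

    Breadth-first search, so the first time the empty word is reached the
    step count is minimal. Returns None if the empty word is unreachable,
    which can happen for words that are not double occurrence words.
    """
    seen = {word}
    queue = deque([(word, 0)])
    while queue:
        current, steps = queue.popleft()
        if current == "":
            return steps
        for reduced in removals(current):
            if reduced not in seen:
                seen.add(reduced)
                queue.append((reduced, steps + 1))
    return None
```

Under these assumptions, `pattern_index("1212")` and `pattern_index("1221")` both give 1 (a single repeat word and a single return word, respectively), while `pattern_index("121323")` gives 2. The search is exponential in the worst case and is meant only to make the definition concrete on short words.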

Included in

Mathematics Commons
