Murdoch University Research Repository

Welcome to the Murdoch University Research Repository

The Murdoch University Research Repository is an open access digital collection of research
created by Murdoch University staff, researchers and postgraduate students.

Learn more

Counting distinct strings

Moore, D., Smyth, W.F. and Miller, D. (1999) Counting distinct strings. Algorithmica, 23 (1). pp. 1-13.

PDF - Authors' Version
Download (226kB)
Link to Published Version:
*Subscription may be required


This paper discusses how to count and generate strings that are ``distinct'' in two senses: p -distinct and b -distinct. Two strings x on alphabet A and x' on alphabet A' are said to be p -distinct iff they represent distinct ``patterns''; that is, iff there exists no one—one mapping from A to A' that transforms x into x' . Thus aab and baa are p -distinct while aab and ddc are p -equivalent. On the other hand, x and x' are said to be b -distinct iff they give rise to distinct border (failure function) arrays: thus aab with border array 010 is b -distinct from aba with border array 001 . The number of p -distinct (resp. b -distinct) strings of length n formed using exactly k different letters is the [k,n] entry in an infinite p' (resp. b' ) array. Column sums p[n] and b[n] in these arrays give the number of distinct strings of length n . We present algorithms to compute, in constant time per string, all p -distinct (resp. b -distinct) strings of length n formed using exactly k letters, and we also show how to compute all elements p'[k,n] and b'[k,n] . These ideas and results have application to the efficient generation of appropriate test data sets for many string algorithms.

Item Type: Journal Article
Publisher: Springer Verlag
Item Control Page Item Control Page


Downloads per month over past year