# Counting distinct strings

Moore, D., Smyth, W.F. and Miller, D.
(1999)
*Counting distinct strings.*
Algorithmica, 23
(1).
pp. 1-13.

*Subscription may be required

## Abstract

This paper discusses how to count and generate strings that are ``distinct'' in two senses: p -distinct and b -distinct. Two strings x on alphabet A and x' on alphabet A' are said to be p -distinct iff they represent distinct ``patterns''; that is, iff there exists no one—one mapping from A to A' that transforms x into x' . Thus aab and baa are p -distinct while aab and ddc are p -equivalent. On the other hand, x and x' are said to be b -distinct iff they give rise to distinct border (failure function) arrays: thus aab with border array 010 is b -distinct from aba with border array 001 . The number of p -distinct (resp. b -distinct) strings of length n formed using exactly k different letters is the [k,n] entry in an infinite p' (resp. b' ) array. Column sums p[n] and b[n] in these arrays give the number of distinct strings of length n . We present algorithms to compute, in constant time per string, all p -distinct (resp. b -distinct) strings of length n formed using exactly k letters, and we also show how to compute all elements p'[k,n] and b'[k,n] . These ideas and results have application to the efficient generation of appropriate test data sets for many string algorithms.

Item Type: | Journal Article |
---|---|

Publisher: | Springer Verlag |

URI: | http://researchrepository.murdoch.edu.au/id/eprint/27564 |

Item Control Page |

## Downloads

Downloads per month over past year