Phase transition in sequence unique reconstruction

Li Xia; Chan Zhou

doi:10.1007/s11424-007-9001-x

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

In this paper, sequence unique reconstruction refers to the property that a sequence is uniquely reconstructable from all its K-tuples. We propose and study the phase transition behavior of the probability P(K) of unique reconstruction with regard to tuple size K in random sequences (iid model). Based on Monte Carlo experiments, artificial proteins generated from iid model exhibit a phase transition when P(K) abruptly jumps from a low value phase (e.g. < 0.1) to a high value phase (e.g. > 0.9). With a generalization to any alphabet, we prove that for a random sequence of length L, as L is large enough, P(K) undergoes a sharp phase transition when p ≤ 0.1015, where p = P(two random letters match). Besides, formulas are derived to estimate the transition points, which may be of practical use in sequencing DNA by hybridization. Concluded from our study, most proteins do not deviate greatly from random sequences in the sense of sequence unique reconstruction, while there are some "stubborn" proteins which only become uniquely reconstructable at a very large K and probably have biological implications.
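The Monte Carlo estimate of P(K) described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the K-tuple collection is assumed to be a multiset (repeated tuples are counted), letters are assumed to be drawn uniformly, and reconstructions are found by brute-force backtracking, which is only practical for short sequences.

```python
import random
from collections import Counter


def count_reconstructions(seq, k, limit=2):
    """Count distinct sequences whose multiset of k-tuples equals that of seq.

    The search stops as soon as `limit` reconstructions are found, so the
    return value is capped at `limit`. Brute-force sketch, not the paper's method.
    """
    n_tuples = len(seq) - k + 1
    if k < 1 or n_tuples < 1:
        raise ValueError("need 1 <= k <= len(seq)")
    remaining = Counter(seq[i:i + k] for i in range(n_tuples))
    alphabet = sorted(set(seq))
    found = 0

    def extend(suffix, used):
        # suffix: last k-1 letters of the partial reconstruction
        # used: number of k-tuples consumed so far
        nonlocal found
        if used == n_tuples:
            found += 1
            return
        for letter in alphabet:
            candidate = suffix + letter
            if remaining[candidate] > 0:
                remaining[candidate] -= 1
                extend(candidate[1:], used + 1)
                remaining[candidate] += 1
                if found >= limit:
                    return

    # Any k-tuple may serve as the first tuple of a reconstruction.
    for first in list(remaining):
        remaining[first] -= 1
        extend(first[1:], 1)
        remaining[first] += 1
        if found >= limit:
            break
    return found


def is_uniquely_reconstructable(seq, k):
    """True if seq is the only sequence with its multiset of k-tuples."""
    return count_reconstructions(seq, k, limit=2) == 1


def estimate_unique_probability(k, length, alphabet, trials=200, seed=0):
    """Monte Carlo estimate of P(K) for uniform iid sequences (assumed model)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        seq = "".join(rng.choice(alphabet) for _ in range(length))
        hits += is_uniquely_reconstructable(seq, k)
    return hits / trials


if __name__ == "__main__":
    # Scan the tuple size K for short random DNA-like sequences.
    for k in range(2, 9):
        print(k, estimate_unique_probability(k, length=30, alphabet="ACGT"))
```

Scanning K for a fixed length in this way gives a rough view of where the estimated probability rises from near 0 to near 1. The sequence length, trial count, and alphabet used here are arbitrary illustrative choices.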

Original language	English (US)
Pages (from-to)	18-29
Number of pages	12
Journal	Journal of Systems Science and Complexity
Volume	20
Issue number	1
DOIs	https://doi.org/10.1007/s11424-007-9001-x
State	Published - Mar 2007
Externally published	Yes

Keywords

Phase transition
Probability
Protein sequence
SBH
Unique reconstruction

ASJC Scopus subject areas

Computer Science (miscellaneous)
Information Systems

Access to Document

10.1007/s11424-007-9001-x

Cite this

@article{3b29f753f1f342fc85f39019f17fd988,

title = "Phase transition in sequence unique reconstruction",

abstract = "In this paper, sequence unique reconstruction refers to the property that a sequence is uniquely reconstructable from all its K-tuples. We propose and study the phase transition behavior of the probability P(K) of unique reconstruction with regard to tuple size K in random sequences (iid model). Based on Monte Carlo experiments, artificial proteins generated from iid model exhibit a phase transition when P(K) abruptly jumps from a low value phase (e.g. < 0.1) to a high value phase (e.g. > 0.9). With a generalization to any alphabet, we prove that for a random sequence of length L, as L is large enough, P(K) undergoes a sharp phase transition when $p \leq 0.1015$, where p = P(two random letters match). Besides, formulas are derived to estimate the transition points, which may be of practical use in sequencing DNA by hybridization. Concluded from our study, most proteins do not deviate greatly from random sequences in the sense of sequence unique reconstruction, while there are some {"}stubborn{"} proteins which only become uniquely reconstructable at a very large K and probably have biological implications.",

keywords = "Phase transition, Probability, Protein sequence, SBH, Unique reconstruction",

author = "Li Xia and Chan Zhou",

year = "2007",

month = mar,

doi = "10.1007/s11424-007-9001-x",

language = "English (US)",

volume = "20",

pages = "18--29",

journal = "Journal of Systems Science and Complexity",

issn = "1009-6124",

publisher = "Springer New York",

number = "1",

}

TY - JOUR

T1 - Phase transition in sequence unique reconstruction

AU - Xia, Li

AU - Zhou, Chan

PY - 2007/3

Y1 - 2007/3

N2 - In this paper, sequence unique reconstruction refers to the property that a sequence is uniquely reconstructable from all its K-tuples. We propose and study the phase transition behavior of the probability P(K) of unique reconstruction with regard to tuple size K in random sequences (iid model). Based on Monte Carlo experiments, artificial proteins generated from iid model exhibit a phase transition when P(K) abruptly jumps from a low value phase (e.g. < 0.1) to a high value phase (e.g. > 0.9). With a generalization to any alphabet, we prove that for a random sequence of length L, as L is large enough, P(K) undergoes a sharp phase transition when p ≤ 0.1015, where p = P(two random letters match). Besides, formulas are derived to estimate the transition points, which may be of practical use in sequencing DNA by hybridization. Concluded from our study, most proteins do not deviate greatly from random sequences in the sense of sequence unique reconstruction, while there are some "stubborn" proteins which only become uniquely reconstructable at a very large K and probably have biological implications.

AB - In this paper, sequence unique reconstruction refers to the property that a sequence is uniquely reconstructable from all its K-tuples. We propose and study the phase transition behavior of the probability P(K) of unique reconstruction with regard to tuple size K in random sequences (iid model). Based on Monte Carlo experiments, artificial proteins generated from iid model exhibit a phase transition when P(K) abruptly jumps from a low value phase (e.g. < 0.1) to a high value phase (e.g. > 0.9). With a generalization to any alphabet, we prove that for a random sequence of length L, as L is large enough, P(K) undergoes a sharp phase transition when p ≤ 0.1015, where p = P(two random letters match). Besides, formulas are derived to estimate the transition points, which may be of practical use in sequencing DNA by hybridization. Concluded from our study, most proteins do not deviate greatly from random sequences in the sense of sequence unique reconstruction, while there are some "stubborn" proteins which only become uniquely reconstructable at a very large K and probably have biological implications.

KW - Phase transition

KW - Probability

KW - Protein sequence

KW - SBH

KW - Unique reconstruction

UR - http://www.scopus.com/inward/record.url?scp=33847304097&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33847304097&partnerID=8YFLogxK

U2 - 10.1007/s11424-007-9001-x

DO - 10.1007/s11424-007-9001-x

M3 - Article

AN - SCOPUS:33847304097

SN - 1009-6124

VL - 20

SP - 18

EP - 29

JO - Journal of Systems Science and Complexity

JF - Journal of Systems Science and Complexity

IS - 1

ER -
