2010-09-14 : Every time I try to read the Old Testament, I'm always amazed at how repetitive
it is. That led me to wonder if I could measure this repetitiveness. What is the longest,
most repeated phrase in the Bible? Note that this is an ambiguous question, and part of this
exercise is to figure out how to resolve the ambiguity.
I'd previously done letter and word counters with Daniel Richard G., but this time I wanted
to calculate word sequences. Also, the Bible is moderately large - over 790,000 words.
That's nothing if you are a computational biologist,
but naive O(n^2) or worse algorithms aren't going to cut it.
It seemed to me that the "sorting an array of all suffix strings" approach would work pretty
well for making it easy to find duplicate strings. However, I also wanted to parameterize my
approach so I could use the same algorithm but vary things like how punctuation was handled and
whether comparison was done on a character by character or word by word basis.
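Here is a minimal sketch of the sorted-suffix idea in Python. It is not the WordPlay code; the function names are made up for illustration, and the slice-based sort key is only sensible for small inputs. Once the suffix start positions are sorted, every repeated sequence of n tokens ends up in one contiguous run of the index, so counting duplicates is a single pass.

```python
from itertools import groupby

def sorted_suffix_index(tokens, max_len):
    """Start positions of all suffixes, ordered by their first max_len tokens."""
    return sorted(range(len(tokens)), key=lambda i: tokens[i:i + max_len])

def repeated_sequences(tokens, n):
    """Return (phrase, count) for every n-token sequence seen more than once."""
    index = sorted_suffix_index(tokens, n)
    # Suffixes shorter than n tokens cannot start an n-token sequence.
    starts = [i for i in index if i + n <= len(tokens)]
    result = []
    # Equal n-token prefixes are neighbours in the sorted index.
    for gram, group in groupby(starts, key=lambda i: tuple(tokens[i:i + n])):
        count = sum(1 for _ in group)
        if count > 1:
            result.append((" ".join(gram), count))
    return sorted(result, key=lambda item: -item[1])

tokens = "the lord said and the lord spake and the lord said".split()
print(repeated_sequences(tokens, 2))
```

Because the tokens are just list elements, the same code works whether they are words or single characters.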
Side note: Interestingly enough, there's a word for a sorted index
of every word in a book: a concordance.
Even more interesting, the first concordance was created for . . . the Bible!
I started out by making a "histogram" of sequences of tokens of length n. Tokens
can be single characters or words (character array ranges), and I can change the tokenizer
to get different results.
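To make the "swap the tokenizer" idea concrete, here is a small sketch. The tokenizers and their punctuation handling are assumptions for illustration, and the histogram uses a hash-based Counter as a shortcut rather than the sorted index.

```python
import re
from collections import Counter

def word_tokens(text):
    """Lower-case words; punctuation and digits are dropped (an assumption)."""
    return re.findall(r"[a-z]+", text.lower())

def char_tokens(text):
    """Every lower-cased character is its own token, punctuation included."""
    return list(text.lower())

def top_sequences(text, n, tokenizer=word_tokens, k=10):
    """The k most frequent sequences of n tokens, with their counts."""
    tokens = tokenizer(text)
    counts = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return counts.most_common(k)

sample = "The LORD said. The Lord spake."
print(top_sequences(sample, 2, k=1))
print(top_sequences(sample, 1, tokenizer=char_tokens, k=3))
```

Changing how punctuation or case is handled only means passing a different tokenizer; the counting code stays the same.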
The Results
Source:
https://latenighthacking.com:8443/svn/code_root/2010/WordPlay
Speed: For the Bible (below) (4,137,819 bytes):
For lower case words:
- Loading: 0.344s, tokenizing: 0.843s, sorting: 4.078s.
- Calculating the top 10 sequences for a given length takes about 0.6s and gets faster
for longer sequences as the non-repeated sequences are incrementally pruned.
- Memory usage: roughly 70MB
For characters:
- Loading: 0.344s, tokenizing: 2.828s, sorting: 35.657s.
- Calculating the top 10 sequences for a given length takes about 4.1s and gets faster.
- Memory usage: roughly 400MB.
Not too bad. Good enough for interactive experimentation.
The Bible
My copy of the Bible comes from O-Bible, King
James Version. I'm using the whole thing, not just the Old Testament.
Character histogram (lower case) of 4,137,819 characters:
" ":758,536, "e":410,139, "t":316,031, "h":282,012, "a":274,632, "o":241,643,
"n":223,958, "i":192,750, "s":189,132, "r":169,113, "d":157,546, "l":129,348,
"f":83,095, "u":82,940, "m":79,542, ",":70,683, "w":65,213, "y":58,248, "g":54,852, "c":54,430,
"b":48,555, "p":42,738, "\n":31,102, "v":30,249, ".":26,145, "k":22,111, ":":12,721, ";":10,139,
"j":8,778, "?":3,297, "z":2,966, "'":1,997, "x":1,449, "q":953, "!":313, "(":221, ")":221, "-":21
Here are some interesting results, just looking at 791,421 (lower case) words.
- Top 10 sequences of length 1: "the":63,919, "and":51,696, "of":34,618, "to":13,560,
"that":12,915, "in":12,667, "he":10,420, "shall":9,837, "unto":8,998, "for":8,971
- Top 10 sequences of length 2: "of the":11,528, "the lord":7,035, "and the":6,268,
"in the":5,030, "and he":2,791, "shall be":2,460, "all the":2,144, "to the":2,142,
"and they":2,086, "unto the":2,032
- Top 10 sequences of length 3: "of the lord":1,775, "the son of":1,451, "the children of":1,355,
"the house of":883, "saith the lord":854, "the lord and":816, "out of the":805, "and i will":672,
"children of israel":647, "the land of":616
- Top 10 sequences of length 4: "the children of israel":638, "it came to pass":453,
"thus saith the lord":415, "and it came to":396, "of the children of":374,
"of the lord and":330, "the lord thy god":304, "the house of the":279, "the word of the":266,
"word of the lord":258
- Top 10 sequences of length 5: "and it came to pass":396, "the word of the lord":258,
"the house of the lord":234, "of the children of israel":166, "thus saith the lord god":162,
"it came to pass when":142, "the tabernacle of the congregation":133,
"the children of israel and":128, "and the lord said unto":124, "saith the lord of hosts":123
- Top 10 sequences of length 6: "and it came to pass when":125, "and it shall come to pass":102,
"the house of the lord and":100, "and the lord spake unto moses":99,
"the word of the lord came":92, "out of the land of egypt":82, "in the sight of the lord":80,
"know that i am the lord":77, "the lord spake unto moses saying":74,
"thus saith the lord of hosts":70
- Top 10 sequences of length 7: "and the lord spake unto moses saying":72,
"the word of the lord came unto":63, "shall know that i am the lord":57,
"the door of the tabernacle of the":47, "word of the lord came unto me":46,
"door of the tabernacle of the congregation":45, "evil in the sight of the lord":44,
"of the lord came unto me saying":42, "of the tabernacle of the congregation and":37,
"and it shall come to pass that":36
- Top 10 sequences of length 8: "the word of the lord came unto me":46,
"the door of the tabernacle of the congregation":45, "word of the lord came unto me saying":42,
"book of the chronicles of the kings of":34, "in the book of the chronicles of the":34,
"the book of the chronicles of the kings":34, "the lord of hosts the god of israel":34,
"written in the book of the chronicles of":34, "are they not written in the book of":33,
"and the lord spake unto moses saying speak":32
- Top 10 sequences of length 9: "the word of the lord came unto me saying":42,
"in the book of the chronicles of the kings":34, "the book of the chronicles of the kings of":34,
"written in the book of the chronicles of the":34,
"and the lord spake unto moses saying speak unto":31,
"are they not written in the book of the":31, "of the lord came unto me saying son of":31,
"the lord came unto me saying son of man":31, "word of the lord came unto me saying son":31,
"saith the lord of hosts the god of israel":30
- Top 10 sequences of length 10: "in the book of the chronicles of the kings of":34,
"written in the book of the chronicles of the kings":34,
"of the lord came unto me saying son of man":31,
"the word of the lord came unto me saying son":31,
"word of the lord came unto me saying son of":31,
"are they not written in the book of the chronicles":29,
"not written in the book of the chronicles of the":29,
"they not written in the book of the chronicles of":29,
"thus saith the lord of hosts the god of israel":29,
"and the lord spake unto moses saying speak unto the":22
While the results are correct, as we get to longer sequences, the result is not exactly
what I want. We see the same phrase hits the top 10 multiple times, probably pushing off
other interesting phrases. It's a sliding window problem: if the phrase is actually 15 words long,
it will show up six times in the top ten as the window slides along (15 - 10 + 1 = 6). I would prefer
it show up only once.
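The effect is easy to reproduce. A repeated phrase of L words contains L - n + 1 windows of n words, and each window is counted as a separate sequence. A tiny sketch (the helper name is hypothetical):

```python
def windows(phrase, n):
    """All n-word windows contained in a longer phrase, left to right."""
    words = phrase.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

phrase = "written in the book of the chronicles of the kings of"
for w in windows(phrase, 10):
    print(w)
```

This 11-word phrase yields two 10-word windows, and both of them appear in the length 10 list above with a count of 34.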
There is also the opposite problem, where a core phrase has slightly
different words at the beginning or end. They really are "different" phrases though they are very
similar. For example, from the above:
"in the book of the chronicles of the kings of":34
"not written in the book of the chronicles of the":29
Even though they share the same core, they have different numbers of occurrences.
Other interesting long sequences:
- length 11: "written in the book of the chronicles of the kings of":34
- length 12: "the word of the lord came unto me saying son of man":31
- length 13: "and he did that which was evil in the sight of the lord":17
- length 14: "are they not written in the book of the chronicles of the kings of":29
- length 16: "after their families by the house of their fathers according to the number of the names":12
- length 26: "from twenty years old and upward all that were able to go forth to war those that were numbered of them even of the tribe of":12
And to really bring the sliding window problem home,
- length 99: "his offering was one silver charger the weight whereof was an hundred and thirty shekels one silver bowl of seventy shekels after the shekel of the sanctuary both of them full of fine flour mingled with oil for a meat offering one golden spoon of ten shekels full of incense one young bullock one ram one lamb of the first year for a burnt offering one kid of the goats for a sin offering and for a sacrifice of peace offerings two oxen five rams five he goats five lambs of the first year this was the offering of":7
The longest phrase used more than once is just a variation of the above that prepends two words: length 101: "did offer his offering...":2.
The Qur'an
So there are lots of translations and clearly the results are going to depend on the
translation. I've fed through a couple of translations, and the general feel of the results
remains the same (thankfully).
Using the Pickthall translation (source), which has fairly Biblical English,
here are some interesting results. Looking at 155,833 (lower case) words:
- Top 10 sequences of length 1: "and":7,871, "the":7,638, "of":4,857, "is":3,185, "they":2,882,
"allah":2,742, "that":2,552, "them":2,322, "a":2,282, "he":2,259
- Top 10 sequences of length 2: "of the":1,102, "those who":872, "and the":737, "in the":712,
"that which":558, "allah is":444, "will be":420, "it is":417, "is the":397, "of allah":385
Interestingly, by length 3 we are already under 200 occurrences:
- Top 10 sequences of length 3: "the heavens and":168, "and the earth":157,
"heavens and the":135, "those who believe":134, "those who disbelieve":133, "lo allah is":132,
"of those who":127, "in the earth":123, "on the day":118, "for those who":115
By length 6 we are already under 100 occurrences:
- Top 10 sequences of length 6: "of the heavens and the earth":63,
"the heavens and the earth and":52, "who believe and do good works":41,
"those who believe and do good":40, "is able to do all things":33,
"favours of your lord that ye":31, "is it of the favours of":31,
"it of the favours of your":31, "of the favours of your lord":31, "of your lord that ye deny":31
- length 24: "lo herein is indeed a portent yet most of them are not believers and lo thy lord he is indeed the mighty the merciful":8
- length 33: "said unto them will ye not ward off evil lo i am a faithful messenger unto you so keep your duty to allah and obey me and i ask of you no wage":5
Longest phrase used more than once:
- length 46: "said unto them will ye not ward off evil lo i am a faithful messenger unto you so keep your duty to allah and obey me and i ask of you no wage therefor my wage is the concern only of the lord of the worlds":4
Well, what about the original Arabic? Sure, the program can handle it, assuming a
space is a reasonable word break. I don't know anything about Arabic, Arabic script, or
its Unicode encoding (other than that it's a right-to-left language and requires
complex context-sensitive glyph shaping rules to render correctly) so I'm likely mangling things.
I can't exactly read the results. Plus, if we really want
to compare originals, then I'd need to find some original Aramaic, Hebrew, and Greek for the Bible.
But all we really care about is the fun of seeing the numbers, not the exegesis, right? Right!
Looking at 281,694 words:
- Top 10 sequences of length 1:
"م":24,971, "ل":20,017, "ن":18,291,
"و":14,616, "ي":14,562, "ا":13,656,
"ه":13,226, "ر":10,364, "ك":9,993,
"ب":9,924
- Top 10 sequences of length 2:
"م ن":5,019, "ه م":4,219,
"م ا":3,135, "ع ل":2,825,
"ل ا":2,717, "الل ه":2,667,
"ك م":2,562, "إ ن":2,409,
"و ل":2,207, "ن ا":2,128
- Top 10 sequences of length 3:
"ال ذ ين":999, "ل ه م":799,
"و ل ا":761, "ع ل ي":756,
"ع ل ى":725, "ل ا ي":720,
"و م ا":712, "إ ل ا":666,
"ن ه م":665, "ل ك م":621
- Top 10 sequences of length 4:
"ال أ ر ض":444,
"ع ل ي ه":442,
"ل ي ه م":307,
"إ ن الل ه":263,
"و ل ا ت":248,
"ذ ين آم ن":242,
"ين آم ن وا":242,
"م ؤ م ن":230,
"ل ي ك م":228,
"ع ل ي ك":225
- Top 10 sequences of length 5:
"ع ل ي ه م":244,
"ذ ين آم ن وا":242,
"ال ذ ين آم ن":224,
"م ن ق ب ل":201,
"ي ال أ ر ض":185,
"ف ي ال أ ر":183,
"ع ل ي ك م":178,
"الس م او ات و":172,
"ين ك ف ر وا":172,
"ذ ين ك ف ر":171
- Top 10 sequences of length 6:
"ال ذ ين آم ن وا":224,
"ف ي ال أ ر ض":180,
"ذ ين ك ف ر وا":171,
"ال ذ ين ك ف ر":157,
"ي ا أ ي ه ا":142,
"ات و ال أ ر ض":133,
"الس م او ات و ال":133,
"او ات و ال أ ر":133,
"م او ات و ال أ":133,
"الر ح م ن الر ح":118
- Top 10 sequences of length 7:
"ال ذ ين ك ف ر وا":157,
"الس م او ات و ال أ":133,
"او ات و ال أ ر ض":133,
"م او ات و ال أ ر":133,
"الر ح م ن الر ح يم":118,
"الل ه الر ح م ن الر":114,
"ب س م الل ه الر ح":114,
"س م الل ه الر ح م":114,
"م الل ه الر ح م ن":114,
"ه الر ح م ن الر ح":114
- Top 10 sequences of length 8:
"الس م او ات و ال أ ر":133,
"م او ات و ال أ ر ض":133,
"الل ه الر ح م ن الر ح":114,
"ب س م الل ه الر ح م":114,
"س م الل ه الر ح م ن":114,
"م الل ه الر ح م ن الر":114,
"ه الر ح م ن الر ح يم":114,
"ي ا أ ي ه ا ال ذ":93,
"ا أ ي ه ا ال ذ ين":92,
"أ ي ه ا ال ذ ين آم":89
- Top 10 sequences of length 9:
"الس م او ات و ال أ ر ض":133,
"الل ه الر ح م ن الر ح يم":114,
"ب س م الل ه الر ح م ن":114,
"س م الل ه الر ح م ن الر":114,
"م الل ه الر ح م ن الر ح":114,
"ي ا أ ي ه ا ال ذ ين":92,
"أ ي ه ا ال ذ ين آم ن":89,
"ا أ ي ه ا ال ذ ين آم":89,
"ي ه ا ال ذ ين آم ن وا":89,
"م او ات و ال أ ر ض و":65
- Top 10 sequences of length 10:
"ب س م الل ه الر ح م ن الر":114,
"س م الل ه الر ح م ن الر ح":114,
"م الل ه الر ح م ن الر ح يم":114,
"أ ي ه ا ال ذ ين آم ن وا":89,
"ا أ ي ه ا ال ذ ين آم ن":89,
"ي ا أ ي ه ا ال ذ ين آم":89,
"الس م او ات و ال أ ر ض و":65,
"وا و ع م ل وا الص ال ح ات":52,
"آم ن وا و ع م ل وا الص ال":51,
"ن وا و ع م ل وا الص ال ح":51
Well, we can see that the sliding window problem still happens. It's also interesting to note that
the most frequent sequence of length 10 happens 114 times in the Qur'an compared to 34 times in
the Bible, even though the Bible has 2.8 times as many words (under the poor
assumption English and Arabic words have similar information densities). This suggests that the
Qur'an is even more repetitive than the Bible! As the English translations are less repetitive,
this suggests that there are some errors in our assumptions (or that the translators like to throw
in some variety).
We can improve our estimate of the information density
by noting that in the translation we are using, there
are about 1.8 Arabic words per English word. So the length 12 sequence that happens 114 times
(see below) should be similar to an English sequence of length 6.6. From the Bible, the most
frequent 6 word phrase happens 125 times and the most frequent 7 word phrase happens 72 times.
This suggests the Qur'an is about as repetitive as the Bible.
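The arithmetic behind that estimate, as a quick sketch using the two word counts reported above (the function name is made up, and a flat words-per-word ratio is a crude stand-in for information density):

```python
ARABIC_WORDS = 281_694   # word count reported above for the Arabic text
ENGLISH_WORDS = 155_833  # word count reported above for the Pickthall translation

def equivalent_english_length(arabic_length):
    """Scale an Arabic phrase length to a rough English word count."""
    return arabic_length * ENGLISH_WORDS / ARABIC_WORDS

ratio = ARABIC_WORDS / ENGLISH_WORDS
print(f"{ratio:.1f} Arabic words per English word")
print(f"length 12 in Arabic is roughly {equivalent_english_length(12):.1f} English words")
```

The 114 occurrences of the length 12 Arabic sequence then sit between the Bible's best 6 word phrase (125) and best 7 word phrase (72).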
A few last interesting data points:
- length 12: "ب س م الل ه الر ح م ن الر ح يم":114
- length 12: "ي ا أ ي ه ا ال ذ ين آم ن وا":89
- length 15: "ال ذ ين آم ن وا و ع م ل وا الص ال ح ات":50
- length 16: "ت ج ر ي م ن ت ح ت ه ا ال أ ن ه ار":34
- length 26: "ج ن ات ت ج ر ي م ن ت ح ت ه ا ال أ ن ه ار خ ال د ين ف يه ا":16
Longest phrase used more than once:
- length 107: "ون و ال ذ ين ه م ل ف ر وج ه م ح اف ظ ون إ ل ا ع ل ى أ ز و اج ه م أ و م ا م ل ك ت أ ي م ان ه م ف إ ن ه م غ ي ر م ل وم ين ف م ن اب ت غ ى و ر اء ذ ل ك ف أ ول ئ ك ه م ال ع اد ون و ال ذ ين ه م ل أ م ان ات ه م و ع ه د ه م ر اع ون و ال ذ ين ه م":2
Alice's Adventures in Wonderland
I'm doing this quickly and simplifying contractions by removing apostrophes.
From 27,354 words:
- length 2: "said the":210, "of the":133, "said alice":116, "in a":97, "and the":82,
"in the":80, "it was":76, "to the":69, "the queen":65, "as she":61
- length 3: "the mock turtle":51, "the march hare":30, "said the king":29,
"said the hatter":21, "the white rabbit":21, "said the mock":19, "said to herself":19,
"said the caterpillar":18, "said the gryphon":17, "she said to":17
- length 4: "said the mock turtle":19, "she said to herself":16, "a minute or two":11,
"said the march hare":8, "will you wont you":8
Longest phrase used more than twice:
- length 11: "join the dance will you wont you will you wont you":4
Longest phrase used more than once:
- length 31: "come and join the dance will you wont you will you wont you will you join the dance will you wont you will you wont you wont you join the dance":2
War and Peace
From 565,711 words:
- length 2: "of the":4,038, "in the":2,324, "to the":2,318, "and the":1,475,
"at the":1,347, "on the":1,333, "he had":1,219, "did not":1,053, "prince andrew":983, "he was":956
- length 3: "he did not":225, "one of the":186, "out of the":178,
"that he was":156, "as soon as":146, "he could not":129, "up to the":129, "that it was":127,
"commander in chief":125, "did not know":114
- length 4: "the commander in chief":84, "for a long time":78,
"for the first time":69, "at the same time":61, "in the middle of":52,
"the middle of the":48, "the battle of borodino":44, "he did not know":43,
"in front of the":41, "it seemed to him":39
Longest phrase used more than twice:
- length 9: "i shall look forward very much to your return":3
Longest phrase used more than once:
- length 20: "which under his leadership will be directed against the redoubt and come into line with the rest of the forces":2
The Complete Works of William Shakespeare
From 887,289 words:
- length 2: "i am":1,855, "my lord":1,652, "in the":1,643, "i have":1,617, "i will":1,566,
"of the":1,497, "to the":1,430, "it is":1,078, "to be":973, "that i":928
- length 3: "i pray you":242, "i will not":214, "i know not":159, "i do not":158,
"the duke of":156, "i am not":143, "i am a":139, "i would not":128, "my good lord":128,
"and i will":127
- length 4: "with all my heart":47, "another part of the":44, "i know not what":39,
"exeunt act v scene":37, "by william shakespeare dramatis":36,
"william shakespeare dramatis personae":36, "exeunt act ii scene":33,
"the duke of york":33, "give me your hand":32, "i do beseech you":32
Longest phrase used more than twice:
- length 15: "the fox the ape and the humble bee were still at odds being but three":3
Longest phrase used more than once:
- length 64: "reads when as a lions whelp shall to himself unknown without seeking find and be embracd by a piece of tender air and when from a stately cedar shall be loppd branches which being dead many years shall after revive be jointed to the old stock and freshly grow then shall posthumus end his miseries britain be fortunate and flourish in peace and plenty":2
Pride and Prejudice
From 122,149 words:
- length 2: "of the":465, "to be":443, "in the":382, "i am":303, "of her":262, "to the":252,
"it was":251, "mr darcy":244, "of his":235, "she was":212
- length 3: "i am sure":62, "i do not":62, "as soon as":55, "she could not":50,
"that he had":37, "in the world":34, "it would be":34, "i am not":32, "i dare say":31,
"could not be":30
- length 4: "i do not know":19, "at the same time":16, "i am sure i":15, "the rest of the":15,
"in the course of":14, "as soon as they":13, "lady catherine de bourgh":13,
"her uncle and aunt":12, "mr and mrs gardiner":11, "for the sake of":10
Longest phrase used more than twice:
- length 7: "it was not to be supposed that":4
Longest phrase used more than once:
- length 9: "as he had been used to look in hertfordshire":2,
"that was to make him the happiest of men":2,
"there were some very strong objections against the lady":2
Future work
Solve the sliding window problem.
- One idea is to keep all instances of the sequences. Then
start with the first, most frequent sequence and remove all sequence instances that it
intersects with from the top ten. The problem is that it's a lot of record keeping (maybe, but
matching sequences are neighbors in the index) and possibly an n^2 sweep to remove
conflicts.
- Another idea is to find the longest sequence used more than once, then remove all
instances of that sequence and find the new longest sequence used more than once. This should
produce interesting results but it will require almost a completely new algorithm as the length
of the sequence is not known. Also, the length starts large and gets smaller, so current
optimizations based on increasing length would be useless.
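The first idea could look something like the sketch below. This is an untested-at-scale illustration under my own assumptions, not an implemented solution: it keeps every start position per sequence, repeatedly picks the sequence with the most instances that do not touch already claimed words, and then claims those words. It rescans all candidates on every round, which is exactly the expensive sweep worried about above.

```python
from collections import defaultdict

def top_non_overlapping(tokens, n, k=10):
    """Greedy top-k sequences of n tokens whose instances never share a word."""
    positions = defaultdict(list)
    for i in range(len(tokens) - n + 1):
        positions[tuple(tokens[i:i + n])].append(i)
    covered = [False] * len(tokens)   # words already claimed by a chosen sequence
    results = []
    while len(results) < k:
        best, best_starts = None, []
        for gram, starts in positions.items():
            # Keep only instances that do not intersect an earlier choice.
            free = [s for s in starts if not any(covered[s:s + n])]
            if len(free) > len(best_starts):
                best, best_starts = gram, free
        if best is None or len(best_starts) < 2:
            break                     # nothing repeated is left
        results.append((" ".join(best), len(best_starts)))
        for s in best_starts:
            covered[s:s + n] = [True] * n
        del positions[best]
    return results

tokens = "a b c d x a b c d y a b c d".split()
print(top_non_overlapping(tokens, 3))
```

In this toy input a plain top ten would list both "a b c" and "b c d" three times each; the greedy sweep keeps only the first. Instances of one sequence that overlap each other are not handled here.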
It might be possible to capture the core/affix relations using some sort of tree, where each
branch represents the addition of an affix and a corresponding weight. There would
need to be some sort of canonicalization of the ordering of prefixes and suffixes lest the
tree become a graph. Also, a token can belong to more than one tree so we would need a way to
decide which tree gets it. Assuming these problems can be overcome, it would probably be
interesting to visualize the result as a tree map.
Zipf distributions!
A Second Algorithm
2010-10-06 : I implemented the algorithm that finds the longest duplicated sequences and
removes them iteratively. The results are interesting, but I'm not sure how to turn it into a
metric to measure repetitiveness.
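For reference, here is a rough Python sketch of that loop. It is a simplified reconstruction, not the code that produced the output in this post. Assumptions: tokens are plain strings that never start with a NUL character, removed instances are replaced by unique placeholder tokens so phrases cannot join across a gap, and a repeat is capped so its two copies do not overlap (which can miss some repeats in very periodic text). The slice-based sort is only practical for small inputs.

```python
def longest_repeat(tokens):
    """(length, start) of the longest run of tokens that occurs at least twice."""
    order = sorted(range(len(tokens)), key=lambda i: tokens[i:])
    best = (0, 0)
    for a, b in zip(order, order[1:]):
        n = 0
        # Compare neighbouring suffixes; cap at the distance so copies don't overlap.
        while (n < abs(a - b) and max(a, b) + n < len(tokens)
               and tokens[a + n] == tokens[b + n]):
            n += 1
        if n > best[0]:
            best = (n, a)
    return best

def iterative_longest_repeats(tokens, min_len=2, limit=20):
    """Repeatedly report the longest repeated phrase and cut out its instances."""
    tokens = list(tokens)
    results = []
    while len(results) < limit:
        length, start = longest_repeat(tokens)
        if length < min_len:
            break
        phrase = tokens[start:start + length]
        starts, i = [], 0
        while i + length <= len(tokens):      # non-overlapping instances, left to right
            if tokens[i:i + length] == phrase:
                starts.append(i)
                i += length
            else:
                i += 1
        results.append((length, len(starts), " ".join(phrase)))
        for s in reversed(starts):            # right to left keeps earlier indices valid
            tokens[s:s + length] = [f"\x00{len(results)}-{s}"]
    return results

print(iterative_longest_repeats("a b c d x a b c d y b c z b c".split()))
```

Each word is consumed by at most one reported phrase, which is the property that makes the output a candidate for a tree map.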
It seems like it would be very interesting to feed into a tree
map, since each word will be counted only once (vs. the simple method where each word starts a
subtree and thus the words are counted n^2/2 times). However, this doesn't really
measure repetitiveness, but rather word distribution and highlights common phrases.
Perhaps a better way to do it would be to create a histogram of number of words in
repeated phrases of length n.
Anyway, here's some sample output: the top 20 longest repeated phrases.
The Bible
- length 101, count 2: "did offer his offering was one silver charger the weight whereof was an hundred and thirty shekels one silver bowl of seventy shekels after the shekel of the sanctuary both of them full of fine flour mingled with oil for a meat offering one golden spoon of ten shekels full of incense one young bullock one ram one lamb of the first year for a burnt offering one kid of the goats for a sin offering and for a sacrifice of peace offerings two oxen five rams five he goats five lambs of the first year this was the offering of"
- length 100, count 5: "offered his offering was one silver charger the weight whereof was an hundred and thirty shekels one silver bowl of seventy shekels after the shekel of the sanctuary both of them full of fine flour mingled with oil for a meat offering one golden spoon of ten shekels full of incense one young bullock one ram one lamb of the first year for a burnt offering one kid of the goats for a sin offering and for a sacrifice of peace offerings two oxen five rams five he goats five lambs of the first year this was the offering of"
- length 99, count 2: "his offering was one silver charger of the weight of an hundred and thirty shekels one silver bowl of seventy shekels after the shekel of the sanctuary both of them full of fine flour mingled with oil for a meat offering one golden spoon of ten shekels full of incense one young bullock one ram one lamb of the first year for a burnt offering one kid of the goats for a sin offering and for a sacrifice of peace offerings two oxen five rams five he goats five lambs of the first year this was the offering of"
- length 85, count 2: "hated me for they were too strong for me they prevented me in the day of my calamity but the lord was my stay he brought me forth also into a large place he delivered me because he delighted in me the lord rewarded me according to my righteousness according to the cleanness of my hands hath he recompensed me for i have kept the ways of the lord and have not wickedly departed from my god for all his judgments were before me and"
- length 84, count 2: "fourscore and five thousand and when they arose early in the morning behold they were all dead corpses so sennacherib king of assyria departed and went and returned and dwelt at nineveh and it came to pass as he was worshipping in the house of nisroch his god that adrammelech and sharezer his sons smote him with the sword and they escaped into the land of armenia and esarhaddon his son reigned in his stead in those days was hezekiah sick unto death and"
- length 82, count 2: "the house of his precious things the silver and the gold and the spices and the precious ointment and all the house of his armour and all that was found in his treasures there was nothing in his house nor in all his dominion that hezekiah shewed them not then came isaiah the prophet unto king hezekiah and said unto him what said these men and from whence came they unto thee and hezekiah said they are come from a far country"
- length 75, count 2: "them nor serve them for i the lord thy god am a jealous god visiting the iniquity of the fathers upon the children unto the third and fourth generation of them that hate me and shewing mercy unto thousands of them that love me and keep my commandments thou shalt not take the name of the lord thy god in vain for the lord will not hold him guiltless that taketh his name in vain"
- length 62, count 2: "again take root downward and bear fruit upward for out of jerusalem shall go forth a remnant and they that escape out of mount zion the zeal of the lord of hosts shall do this therefore thus saith the lord concerning the king of assyria he shall not come into this city nor shoot an arrow there nor come before it with"
- length 61, count 2: "was over the household and shebna the scribe and joah the son of asaph the recorder to hezekiah with their clothes rent and told him the words of rabshakeh and it came to pass when king hezekiah heard it that he rent his clothes and covered himself with sackcloth and went into the house of the lord and he sent eliakim"
- length 60, count 2: "the lord hath spoken concerning him the virgin the daughter of zion hath despised thee and laughed thee to scorn the daughter of jerusalem hath shaken her head at thee whom hast thou reproached and blasphemed and against whom hast thou exalted thy voice and lifted up thine eyes on high even against the holy one of israel by thy"
- length 58, count 2: "bullocks two rams and fourteen lambs of the first year without blemish and their meat offering and their drink offerings for the bullocks for the rams and for the lambs shall be according to their number after the manner and one goat for a sin offering beside the continual burnt offering his meat offering and his drink offering"
- length 58, count 2: "into the hand of the king of assyria behold thou hast heard what the kings of assyria have done to all lands by destroying them utterly and shalt thou be delivered have the gods of the nations delivered them which my fathers have destroyed as gozan and haran and rezeph and the children of eden which were in"
- length 56, count 2: "satan answered the lord and said from going to and fro in the earth and from walking up and down in it and the lord said unto satan hast thou considered my servant job that there is none like him in the earth a perfect and an upright man one that feareth god and escheweth evil"
- length 55, count 2: "give thanks unto the lord call upon his name make known his deeds among the people sing unto him sing psalms unto him talk ye of all his wondrous works glory ye in his holy name let the heart of them rejoice that seek the lord seek the lord and his strength seek his face"
- length 55, count 2: "of ten shekels full of incense one young bullock one ram one lamb of the first year for a burnt offering one kid of the goats for a sin offering and for a sacrifice of peace offerings two oxen five rams five he goats five lambs of the first year this was the offering of"
- length 53, count 2: "hundred and thirty the priests the children of jedaiah of the house of jeshua nine hundred seventy and three the children of immer a thousand fifty and two the children of pashur a thousand two hundred forty and seven the children of harim a thousand and seventeen the levites the children of jeshua"
- length 50, count 2: "and their meat offering and their drink offerings for the bullocks for the rams and for the lambs shall be according to their number after the manner and one goat for a sin offering beside the continual burnt offering and his meat offering and his drink offering and on the"
- length 50, count 2: "king of israel sent to amaziah king of judah saying the thistle that was in lebanon sent to the cedar that was in lebanon saying give thy daughter to my son to wife and there passed by a wild beast that was in lebanon and trode down the thistle thou"
- length 50, count 2: "the fat that covereth the inwards and all the fat that is upon the inwards and the two kidneys and the fat that is upon them which is by the flanks and the caul above the liver with the kidneys it shall he take away and the priest shall burn"
- length 48, count 2: "because he was wroth there went up a smoke out of his nostrils and fire out of his mouth devoured coals were kindled by it he bowed the heavens also and came down and darkness was under his feet and he rode upon a cherub and did fly"
The Qur'an
- length 46, count 4: "said unto them will ye not ward off evil lo i am a faithful messenger unto you so keep your duty to allah and obey me and i ask of you no wage therefor my wage is the concern only of the lord of the worlds"
- length 35, count 2: "o children of israel remember my favour wherewith i favoured you and how i preferred you to all creatures and guard yourselves against a day when no soul will in aught avail another nor will"
- length 31, count 2: "then see how dreadful was my punishment after my warnings and in truth we have made the qur an easy to remember but is there any that remembereth the tribe of"
- length 30, count 2: "we believe in allah and that which is revealed unto us and that which was revealed unto abraham and ishmael and isaac and jacob and the tribes and that which"
- length 28, count 2: "perfect his light however much the disbelievers are averse he it is who hath sent his messenger with the guidance and the religion of truth that he may"
- length 28, count 2: "we drowned the others lo herein is indeed a portent yet most of them are not believers and lo thy lord he is indeed the mighty the merciful"
- length 26, count 2: "and unto the tribe of a ad we sent their brother hud he said o my people serve allah ye have no other allah save him"
- length 26, count 2: "canst not make the dead to hear nor canst thou make the deaf to hear the call when they have turned to flee nor canst thou"
- length 26, count 2: "them lo herein is indeed a portent yet most of them are not believers and lo thy lord he is indeed the mighty the merciful the"
- length 25, count 2: "lo herein is indeed a portent yet most of them are not believers and lo thy lord he is indeed the mighty the merciful and"
- length 25, count 2: "marvel ye that there should come unto you a reminder from your lord by means of a man among you that he may warn you"
- length 24, count 2: "doers and unto midian we sent their brother shu eyb he said o my people serve allah ye have no other allah save him"
- length 24, count 2: "lo herein is indeed a portent yet most of them are not believers and lo thy lord he is indeed the mighty the merciful"
- length 23, count 2: "allah will not take you to task for that which is unintentional in your oaths but he will take you to task for"
- length 23, count 2: "in me but i am a messenger from the lord of the worlds i convey unto you the messages of my lord and"
- length 23, count 2: "the tribe of thamud we sent their brother salih he said o my people serve allah ye have no other allah save him"
- length 22, count 2: "all that is in the heavens and all that is in the earth glorifieth allah and he is the mighty the wise"
- length 22, count 2: "the manna and the quails saying eat of the good things wherewith we have provided you they wronged us not but they"
- length 22, count 2: "twain say hath he forbidden the two males or the two females or that which the wombs of the two females contain"
- length 22, count 2: "whom neither man nor jinni will have touched before them which is it of the favours of your lord that ye deny"
- length 107, count 2: "ون و ال ذ ين ه م ل ف ر وج ه م ح اف ظ ون إ ل ا ع ل ى أ ز و اج ه م أ و م ا م ل ك ت أ ي م ان ه م ف إ ن ه م غ ي ر م ل وم ين ف م ن اب ت غ ى و ر اء ذ ل ك ف أ ول ئ ك ه م ال ع اد ون و ال ذ ين ه م ل أ م ان ات ه م و ع ه د ه م ر اع ون و ال ذ ين ه م"
- length 88, count 2: "ون ي ا ب ن ي إ س ر ائ يل اذ ك ر وا ن ع م ت ي ال ت ي أ ن ع م ت ع ل ي ك م و أ ن ي ف ض ل ت ك م ع ل ى ال ع ال م ين و ات ق وا ي و م ا ل ا ت ج ز ي ن ف س ع ن ن ف س ش ي ئ ا و ل ا ي ق ب ل م ن ه ا"
- length 87, count 2: "وا و إ ن ك ن ت م م ر ض ى أ و ع ل ى س ف ر أ و ج اء أ ح د م ن ك م م ن ال غ ائ ط أ و ل ام س ت م الن س اء ف ل م ت ج د وا م اء ف ت ي م م وا ص ع يد ا ط ي ب ا ف ام س ح وا ب و ج وه ك م و أ ي د يك م"
- length 78, count 2: "ت م ن ور ه و ل و ك ر ه ال ك اف ر ون ه و ال ذ ي أ ر س ل ر س ول ه ب ال ه د ى و د ين ال ح ق ل ي ظ ه ر ه ع ل ى الد ين ك ل ه و ل و ك ر ه ال م ش ر ك ون ي ا أ ي ه ا ال ذ ين آم ن وا"
- length 69, count 3: "أ ل ا ت ت ق ون إ ن ي ل ك م ر س ول أ م ين ف ات ق وا الل ه و أ ط يع ون و م ا أ س أ ل ك م ع ل ي ه م ن أ ج ر إ ن أ ج ر ي إ ل ا ع ل ى ر ب ال ع ال م ين أ ت"
- length 67, count 2: "أ ل ا ت ت ق ون إ ن ي ل ك م ر س ول أ م ين ف ات ق وا الل ه و أ ط يع ون و م ا أ س أ ل ك م ع ل ي ه م ن أ ج ر إ ن أ ج ر ي إ ل ا ع ل ى ر ب ال ع ال م ين"
- length 65, count 2: "و ل ق د آت ي ن ا م وس ى ال ك ت اب ف اخ ت ل ف ف يه و ل و ل ا ك ل م ة س ب ق ت م ن ر ب ك ل ق ض ي ب ي ن ه م و إ ن ه م ل ف ي ش ك م ن ه م ر يب"
- length 61, count 2: "إ ل ى ي و م الد ين ق ال ر ب ف أ ن ظ ر ن ي إ ل ى ي و م ي ب ع ث ون ق ال ف إ ن ك م ن ال م ن ظ ر ين إ ل ى ي و م ال و ق ت ال م ع ل وم ق ال"
- length 58, count 2: "ال ذ ين ي ن ق ض ون ع ه د الل ه م ن ب ع د م يث اق ه و ي ق ط ع ون م ا أ م ر الل ه ب ه أ ن ي وص ل و ي ف س د ون ف ي ال أ ر ض أ ول ئ ك"
- length 57, count 2: "الظ ال م ون إ ن ت ت ب ع ون إ ل ا ر ج ل ا م س ح ور ا ان ظ ر ك ي ف ض ر ب وا ل ك ال أ م ث ال ف ض ل وا ف ل ا ي س ت ط يع ون س ب يل ا"
- length 57, count 2: "ع ل ي ه م ل ع ن ة الل ه و ال م ل ائ ك ة و الن اس أ ج م ع ين خ ال د ين ف يه ا ل ا ي خ ف ف ع ن ه م ال ع ذ اب و ل ا ه م ي ن ظ ر ون"
- length 57, count 2: "ف إ ذ ا س و ي ت ه و ن ف خ ت ف يه م ن ر وح ي ف ق ع وا ل ه س اج د ين ف س ج د ال م ل ائ ك ة ك ل ه م أ ج م ع ون إ ل ا إ ب ل يس"
- length 57, count 2: "ق ال ر ب أ و ز ع ن ي أ ن أ ش ك ر ن ع م ت ك ال ت ي أ ن ع م ت ع ل ي و ع ل ى و ال د ي و أ ن أ ع م ل ص ال ح ا ت ر ض اه و أ"
- length 57, count 2: "ه م ي ت ل و ع ل ي ه م آي ات ه و ي ز ك يه م و ي ع ل م ه م ال ك ت اب و ال ح ك م ة و إ ن ك ان وا م ن ق ب ل ل ف ي ض ل ال م ب ين"
- length 56, count 2: "إ ن ك ن ت م ن الص اد ق ين ف أ ل ق ى ع ص اه ف إ ذ ا ه ي ث ع ب ان م ب ين و ن ز ع ي د ه ف إ ذ ا ه ي ب ي ض اء ل لن اظ ر ين ق ال"
- length 56, count 2: "م ال م ن و الس ل و ى ك ل وا م ن ط ي ب ات م ا ر ز ق ن اك م و م ا ظ ل م ون ا و ل ك ن ك ان وا أ ن ف س ه م ي ظ ل م ون و إ ذ ق"
- length 54, count 2: "ا و ال ذ ين آم ن وا و ع م ل وا الص ال ح ات س ن د خ ل ه م ج ن ات ت ج ر ي م ن ت ح ت ه ا ال أ ن ه ار خ ال د ين ف يه ا أ ب د ا"
- length 54, count 2: "ف م ن ث ق ل ت م و از ين ه ف أ ول ئ ك ه م ال م ف ل ح ون و م ن خ ف ت م و از ين ه ف أ ول ئ ك ال ذ ين خ س ر وا أ ن ف س ه م"
- length 51, count 2: "م ه ذ ه ن اق ة الل ه ل ك م آي ة ف ذ ر وه ا ت أ ك ل ف ي أ ر ض الل ه و ل ا ت م س وه ا ب س وء ف ي أ خ ذ ك م ع ذ اب"
- length 51, count 2: "ون ت ل ك أ م ة ق د خ ل ت ل ه ا م ا ك س ب ت و ل ك م م ا ك س ب ت م و ل ا ت س أ ل ون ع م ا ك ان وا ي ع م ل ون"
Alice's Adventures in Wonderland
- length 31, count 2: "come and join the dance will you wont you will you wont you will you join the dance will you wont you will you wont you wont you join the dance"
- length 18, count 2: "beautiful soup beau ootiful soo oop beau ootiful soo oop soo oop of the e e evening beautiful"
- length 13, count 2: "would not join the dance would not could not would not could not"
- length 11, count 2: "and shouting off with his head or off with her head"
- length 10, count 2: "the caterpillar took the hookah out of its mouth and"
- length 10, count 2: "the white rabbit blew three blasts on the trumpet and"
- length 10, count 2: "why did they live at the bottom of a well"
- length 9, count 2: "and shes such a capital one for catching mice"
- length 9, count 2: "edwin and morcar the earls of mercia and northumbria"
- length 9, count 2: "said the duchess and the moral of that is"
- length 9, count 2: "was immediately suppressed by the officers of the court"
- length 9, count 2: "would you tell me said alice a little timidly"
- length 8, count 2: "believe theres an atom of meaning in it"
- length 8, count 2: "i gave her one they gave him two"
- length 8, count 2: "it was as much as she could do"
- length 8, count 2: "they have their tails in their mouths and"
- length 8, count 2: "you might just as well say added the"
- length 7, count 2: "and began to repeat it but her"
- length 7, count 2: "it wasnt very civil of you to"
- length 7, count 2: "oh i beg your pardon cried alice"
War and Peace
- length 20, count 2: "which under his leadership will be directed against the redoubt and come into line with the rest of the forces"
- length 15, count 2: "one of my eyes was sore but now i am on the lookout with both"
- length 14, count 2: "and cross by its three bridges advancing to the same heights as morands and"
- length 13, count 2: "in this manner orders will be given in accordance with the enemys movements"
- length 13, count 2: "we know the laws of inevitability to which it is subject from the"
- length 12, count 2: "according to the point of view from which the action is regarded"
- length 12, count 2: "general campan will move through the wood to seize the first fortification"
- length 12, count 2: "i will detain you no longer general you shall receive my letter"
- length 12, count 2: "that he would not make peace so long as a single armed"
- length 12, count 2: "the ideas of the revolution and the general temper of the age"
- length 11, count 2: "came up to him and took his hand i am very"
- length 11, count 2: "i love you all and have done no harm to anyone"
- length 11, count 2: "you ought to have been there at seven in the morning"
- length 10, count 2: "at ten in the morning of the second of september"
- length 10, count 2: "piti piti piti and ti ti and piti piti piti"
- length 10, count 2: "reddish hands with hairy wrists visible from under the shirt"
- length 10, count 2: "the only conception that can explain the movement of the"
- length 10, count 2: "to the iberian shrine of the mother of god and"
- length 10, count 2: "under the same conditions and with the same character he"
- length 9, count 2: "and he felt the blood rush to his heart"
The Complete Works of William Shakespeare
- length 64, count 2: "reads when as a lions whelp shall to himself unknown without seeking find and be embracd by a piece of tender air and when from a stately cedar shall be loppd branches which being dead many years shall after revive be jointed to the old stock and freshly grow then shall posthumus end his miseries britain be fortunate and flourish in peace and plenty"
- length 30, count 2: "and so am i for phebe phebe and i for ganymede orlando and i for rosalind rosalind and i for no woman silvius it is to be all made of"
- length 25, count 2: "the cuckoo then on every tree mocks married men for thus sings he cuckoo cuckoo cuckoo o word of fear unpleasing to a married ear"
- length 19, count 2: "but do not so i love thee in such sort as thou being mine mine is thy good report"
- length 19, count 2: "our house bequeathed down from many ancestors which were the greatest obloquy i th world in me to lose"
- length 19, count 2: "supposed king that lewis of france is sending over masquers to revel it with him and his new bride"
- length 19, count 2: "to morrow in the battle think on me and fall thy edgeless sword despair and die to richmond thou"
- length 18, count 2: "come hither come hither come hither here shall he see no enemy but winter and rough weather jaques"
- length 18, count 2: "tell him from me that he hath done me wrong and therefore ill uncrown him eret be long"
- length 17, count 2: "let him shun castles safer shall he be upon the sandy plains than where castles mounted stand"
- length 17, count 2: "tell him in hope hell prove a widower shortly ill wear the willow garland for his sake"
- length 16, count 2: "scene another part of the island enter alonso sebastian antonio gonzalo adrian francisco and others gonzalo"
- length 16, count 2: "the duke yet lives that henry shall depose but him outlive and die a violent death"
- length 15, count 2: "o a pit of clay for to be made for such a guest is meet"
- length 15, count 2: "part of king henry the sixth by william shakespeare dramatis personae king henry the sixth"
- length 15, count 3: "the fox the ape and the humble bee were still at odds being but three"
- length 14, count 2: "scene iii alexandria cleopatras palace enter cleopatra charmian iras and alexas cleopatra where is"
- length 13, count 2: "against my will i am sent to bid you come in to dinner"
- length 13, count 2: "alexander nathaniel when in the world i livd i was the worlds commander"
- length 13, count 2: "all double double toil and trouble fire burn and cauldron bubble second witch"
Pride and Prejudice
- length 9, count 2: "as he had been used to look in hertfordshire"
- length 9, count 2: "that was to make him the happiest of men"
- length 9, count 2: "there were some very strong objections against the lady"
- length 8, count 2: "after the health of her family she answered"
- length 8, count 2: "had you behaved in a more gentlemanlike manner"
- length 8, count 2: "i think i have heard you say that"
- length 8, count 2: "will never see you again if you do"
- length 8, count 2: "would do me a great deal of good"
- length 7, count 2: "and i wish with all my heart"
- length 7, count 2: "entered the room with an air more"
- length 7, count 2: "i am sure i do not know"
- length 7, count 2: "i have written to colonel forster to"
- length 7, count 2: "i knew how it would be i"
- length 7, count 2: "i thank you again and again for"
- length 7, count 2: "if one could but go to brighton"
- length 7, count 2: "it is very hard to think that"
- length 7, count 4: "it was not to be supposed that"
- length 7, count 2: "it will be in my power to"
- length 7, count 2: "the carriage drove up to the door"
- length 7, count 2: "there are very few of us who"
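The listings above all share one row format: a phrase of n lower-case words and the number of times it occurs. As a rough illustration of how such rows could be produced, here is a minimal Python sketch. It is not the code from the repository linked earlier, and it uses a plain n-gram counter rather than the sorted-suffix approach described above. The function names (`tokenize`, `repeated_ngrams`, `format_row`) are hypothetical, and the tokenizer rules (drop apostrophes, split on anything that is not a letter) are an assumption inferred from tokens such as "wont" and "to morrow" in the output.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text, drop apostrophes, and split on non-letters.

    Assumed rules, inferred from the listings: "won't" becomes "wont",
    "to-morrow" becomes "to" and "morrow".
    """
    text = re.sub(r"['\u2019]", "", text.lower())
    # [^\W\d_]+ matches runs of Unicode letters only.
    return re.findall(r"[^\W\d_]+", text)


def repeated_ngrams(tokens, n, min_count=2):
    """Return (count, phrase) pairs for every n-word sequence that occurs
    at least min_count times, most frequent first, then alphabetically."""
    counts = Counter(
        tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )
    rows = [(c, " ".join(gram)) for gram, c in counts.items() if c >= min_count]
    return sorted(rows, key=lambda row: (-row[0], row[1]))


def format_row(n, count, phrase):
    """Format one result in the style of the listings above."""
    return f'- length {n}, count {count}: "{phrase}"'


if __name__ == "__main__":
    sample = "Will you, won't you, will you, won't you, will you join the dance?"
    words = tokenize(sample)
    for count, phrase in repeated_ngrams(words, 6):
        print(format_row(6, count, phrase))
```

Counting every n-gram in a dictionary is simple but holds all distinct sequences of length n in memory at once, which is part of why a sorted suffix index is more attractive for a text the size of the Bible. Overlapping occurrences are counted separately in this sketch.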