Simulation of a large DNA Profile Database, full alleles, 13 loci ,CODIS

Simulation of a large DNA Profile Database, full alleles , 13 loci ,CODIS



13 loci CODIS simulation using RCMP Toronto Caucasian data
http://www.csfs.ca/databases/cfs_CC_ProfilerPlus_freq.htm
http://www.csfs.ca/databases/cfs_CC_Cofiler_freq.htm


' Generating 13 loci x2 profiles ' directing pairs and first divider Dim ph(26) Dim pb(26) ' initialising Random Number Generator - RNG count9 = 0 count8 = 0 Randomize a = 214013 c = 2531011 x0 = Timer z = 2 ^ 24 ' 1 file 'jun27-g' for original, un-directed pairs, source data. ' This file is necessary to check on the performance of the RNG ' when a matched pair is found then it is highly unlikely that ' both sequences as generated, before pair directing, would ' be the same - more likely a manifest of repeat within the RNG ' (reason for adopting the 214013 / 2531011 RNG ) ' Use 'Word' find function on part of the sequences, including pair reversals, ' with luck would include a 'homozygotic' pair eg (3,3) say ,so no reversal ' on that pair Open "jun27-g" For Output As #1 ' outputs directed and divided by first digit Open "jun27-0" For Output As #10 Open "jun27-1" For Output As #11 Open "jun27-2" For Output As #12 Open "jun27-3" For Output As #13 Open "jun27-4" For Output As #14 Open "jun27-5" For Output As #15 Open "jun27-6" For Output As #16 Open "jun27-7" For Output As #17 Open "jun27-8" For Output As #18 Open "jun27-9" For Output As #19 ' change xxxx for different total size ' for xxxx = 10000000 my computer took 5 hours to generate over-night xxxx = 5000 xxxx = xxxx - 1 For x = 0 To xxxx For j = 0 To 1 ' D3 , locus 1 ' RNG random number generator temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.005 Then ph(j) = 11 If ph(j) < 0.01 Then ph(j) = 1 If ph(j) < 0.124 Then ph(j) = 2 If ph(j) < 0.382 Then ph(j) = 3 If ph(j) < 0.623 Then ph(j) = 4 If ph(j) < 0.84 Then ph(j) = 5 If ph(j) < 0.988 Then ph(j) = 6 If ph(j) < 0.998 Then ph(j) = 7 If ph(j) < 1 Then ph(j) = 8 If ph(j) > 10 Then ph(j) = 0 Next j For j = 2 To 3 ' vWA locus 2 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.002 Then ph(j) = 11 If ph(j) < 0.084 Then ph(j) = 1 If ph(j) < 0.193 Then ph(j) = 2 If ph(j) < 0.42 Then ph(j) = 3 If ph(j) < 0.691 Then ph(j) = 4 If ph(j) < 0.903 Then ph(j) = 5 If ph(j) < 0.987 Then ph(j) = 6 If ph(j) < 1 Then ph(j) = 7 If ph(j) > 10 Then ph(j) = 0 Next j For j = 4 To 5 ' D8 , locus 3 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.012 Then ph(j) = 11 If ph(j) < 0.022 Then ph(j) = 1 If ph(j) < 0.107 Then ph(j) = 2 If ph(j) < 0.187 Then ph(j) = 3 If ph(j) < 0.324 Then ph(j) = 4 If ph(j) < 0.64 Then ph(j) = 5 If ph(j) < 0.865 Then ph(j) = 6 If ph(j) < 0.976 Then ph(j) = 7 If ph(j) < 1 Then ph(j) = 8 If ph(j) > 10 Then ph(j) = 0 Next j For j = 6 To 7 ' D5 , locus 4 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.002 Then ph(j) = 11 If ph(j) < 0.004 Then ph(j) = 1 If ph(j) < 0.031 Then ph(j) = 2 If ph(j) < 0.099 Then ph(j) = 3 If ph(j) < 0.435 Then ph(j) = 4 If ph(j) < 0.824 Then ph(j) = 5 If ph(j) < 0.985 Then ph(j) = 6 If ph(j) < 1 Then ph(j) = 7 If ph(j) > 10 Then ph(j) = 0 Next j For j = 8 To 9 ' D13 , locus 5 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.111 Then ph(j) = 11 If ph(j) < 0.198 Then ph(j) = 1 If ph(j) < 0.268 Then ph(j) = 2 If ph(j) < 0.543 Then ph(j) = 3 If ph(j) < 0.862 Then ph(j) = 4 If ph(j) < 0.944 Then ph(j) = 5 If ph(j) < 1 Then ph(j) = 6 If ph(j) > 10 Then ph(j) = 0 Next j For j = 10 To 11 ' D7, locus 6 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.02 Then ph(j) = 11 If ph(j) < 0.162 Then ph(j) = 1 If ph(j) < 0.302 Then ph(j) = 2 If ph(j) < 0.589 Then ph(j) = 3 If ph(j) < 0.816 Then ph(j) = 4 If ph(j) < 0.955 Then ph(j) = 5 If ph(j) < 0.993 Then ph(j) = 6 If ph(j) < 1 Then ph(j) = 7 If ph(j) > 10 Then ph(j) = 0 Next j For j = 12 To 13 ' D16, locus 7 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.017 Then ph(j) = 11 If ph(j) < 0.133 Then ph(j) = 1 If ph(j) < 0.182 Then ph(j) = 2 If ph(j) < 0.472 Then ph(j) = 3 If ph(j) < 0.824 Then ph(j) = 4 If ph(j) < 0.972 Then ph(j) = 5 If ph(j) < 1 Then ph(j) = 6 If ph(j) > 10 Then ph(j) = 0 Next j For j = 14 To 15 ' FGA, locus 8 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj pb(j) = "Z" If ph(j) < 0.002 Then ph(j) = 11 If ph(j) < 0.011 Then ph(j) = 1 If ph(j) < 0.091 Then ph(j) = 2 If ph(j) < 0.245 Then ph(j) = 3 If ph(j) < 0.429 Then ph(j) = 4 If ph(j) < 0.431 Then ph(j) = 5 If ph(j) < 0.605 Then ph(j) = 6 If ph(j) < 0.612 Then ph(j) = 7 If ph(j) < 0.755 Then ph(j) = 8 If ph(j) < 0.897 Then ph(j) = 9 If ph(j) < 0.963 And ph(j) >= 0.897 Then pb(j) = "A" If ph(j) < 0.993 And ph(j) >= 0.963 Then pb(j) = "B" If ph(j) < 1 And ph(j) >= 0.993 Then pb(j) = "C" If ph(j) > 10 Then ph(j) = 0 If pb(j) <> "Z" Then ph(j) = pb(j) Next j For j = 16 To 17 ' D21, locus 9 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj pb(j) = "Z" If ph(j) < 0.005 Then ph(j) = 11 If ph(j) < 0.041 Then ph(j) = 1 If ph(j) < 0.201 Then ph(j) = 2 If ph(j) < 0.452 Then ph(j) = 3 If ph(j) < 0.454 Then ph(j) = 4 If ph(j) < 0.683 Then ph(j) = 5 If ph(j) < 0.7 Then ph(j) = 6 If ph(j) < 0.785 Then ph(j) = 7 If ph(j) < 0.874 Then ph(j) = 8 If ph(j) < 0.886 Then ph(j) = 9 If ph(j) < 0.964 And ph(j) >= 0.886 Then pb(j) = "A" If ph(j) < 0.996 And ph(j) >= 0.964 Then pb(j) = "B" If ph(j) < 0.998 And ph(j) >= 0.996 Then pb(j) = "C" If ph(j) < 1 And ph(j) >= 0.998 Then pb(j) = "D" If ph(j) > 10 Then ph(j) = 0 If pb(j) <> "Z" Then ph(j) = pb(j) Next j For j = 18 To 19 ' D18, locus 10 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj pb(j) = "Z" If ph(j) < 0.005 Then ph(j) = 11 If ph(j) < 0.017 Then ph(j) = 1 If ph(j) < 0.137 Then ph(j) = 2 If ph(j) < 0.256 Then ph(j) = 3 If ph(j) < 0.435 Then ph(j) = 4 If ph(j) < 0.602 Then ph(j) = 5 If ph(j) < 0.742 Then ph(j) = 6 If ph(j) < 0.863 Then ph(j) = 7 If ph(j) < 0.924 Then ph(j) = 8 If ph(j) < 0.955 Then ph(j) = 9 If ph(j) < 0.974 And ph(j) >= 0.955 Then pb(j) = "A" If ph(j) < 0.986 And ph(j) >= 0.974 Then pb(j) = "B" If ph(j) < 0.991 And ph(j) >= 0.986 Then pb(j) = "C" If ph(j) < 0.998 And ph(j) >= 0.991 Then pb(j) = "D" If ph(j) < 1 And ph(j) >= 0.998 Then pb(j) = "E" If ph(j) > 10 Then ph(j) = 0 If pb(j) <> "Z" Then ph(j) = pb(j) Next j For j = 20 To 21 ' THO1 , locus 11 ' RNG random number generator temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.239 Then ph(j) = 11 If ph(j) < 0.403 Then ph(j) = 1 If ph(j) < 0.512 Then ph(j) = 2 If ph(j) < 0.679 Then ph(j) = 3 If ph(j) < 0.988 Then ph(j) = 4 If ph(j) < 1 Then ph(j) = 5 If ph(j) > 10 Then ph(j) = 0 Next j For j = 22 To 23 ' TPOX, locus 12 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.543 Then ph(j) = 11 If ph(j) < 0.632 Then ph(j) = 1 If ph(j) < 0.7 Then ph(j) = 2 If ph(j) < 0.968 Then ph(j) = 3 If ph(j) < 1 Then ph(j) = 4 If ph(j) > 10 Then ph(j) = 0 Next j For j = 24 To 25 ' CSF1PO , locus 13 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.002 Then ph(j) = 11 If ph(j) < 0.031 Then ph(j) = 1 If ph(j) < 0.301 Then ph(j) = 2 If ph(j) < 0.62 Then ph(j) = 3 If ph(j) < 0.928 Then ph(j) = 4 If ph(j) < 0.991 Then ph(j) = 5 If ph(j) < 0.998 Then ph(j) = 6 If ph(j) < 1 Then ph(j) = 7 If ph(j) > 10 Then ph(j) = 0 Next j ' output the original generated file Write #1, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) ' Because in real DNA profiles without further info ,no one ' knows which allele in each pair came from the mother or father ' by convention they are written smaller ,larger (or equal). ' The following directs each pair For j = 0 To 24 Step 2 If ph(j + 1) < ph(j) Then jjj = ph(j) ph(j) = ph(j + 1) ph(j + 1) = jjj End If Next j ' put extra conditional statements here to reduce ' the number of files or just delete some of the following ' ' dividing on first column, file by file If ph(0) = 0 Then Write #10, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count0 = count0 + 1 End If If ph(0) = 1 Then Write #11, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count1 = count1 + 1 End If If ph(0) = 2 Then Write #12, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count2 = count2 + 1 End If If ph(0) = 3 Then Write #13, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count3 = count3 + 1 End If If ph(0) = 4 Then Write #14, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count4 = count4 + 1 End If If ph(0) = 5 Then Write #15, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count5 = count5 + 1 End If If ph(0) = 6 Then Write #16, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count6 = count6 + 1 End If If ph(0) = 7 Then Write #17, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count7 = count7 + 1 End If If ph(0) = 8 Then Write #18, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count8 = count8 + 1 End If If ph(0) = 9 Then Write #19, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count9 = count9 + 1 End If Next x Close #10 Close #11 Close #12 Close #13 Close #14 Close #15 Close #16 Close #17 Close #18 Close #19 Close #1 ' count file for data to fix for - next loops in sucessive dividings Open "jun27-c" For Output As #20 Write #20, 0, count0, 1, count1, 2, count2, 3, count3, 4, count4, 5, count5, 6, count6, 7, count7, 8, count8, 9, count9 Close #20
' Dividing file into 10 by second digit Dim ph(26) dim ps as string ' xxxx = count size from count file xxxx = ' input file Open "jun27-1" For Input As #1 ' 10 divided files Open "jun27-10" For Output As #10 Open "jun27-11" For Output As #11 Open "jun27-12" For Output As #12 Open "jun27-13" For Output As #13 Open "jun27-14" For Output As #14 Open "jun27-15" For Output As #15 Open "jun27-16" For Output As #16 Open "jun27-17" For Output As #17 Open "jun27-18" For Output As #18 Open "jun27-19" For Output As #19 count9 = 0 count8 = 0 xxxx = xxxx - 1 For x = 0 To xxxx Input #1, ps a2$ = Mid(ps, 2, 1) ph(1) = Val(a2$) If ph(1) = 0 Then Write #10, ps count0 = count0 + 1 End If If ph(1) = 1 Then Write #11, ps count1 = count1 + 1 End If If ph(1) = 2 Then Write #12, ps count2 = count2 + 1 End If If ph(1) = 3 Then Write #13, ps count3 = count3 + 1 End If If ph(1) = 4 Then Write #14, ps count4 = count4 + 1 End If If ph(1) = 5 Then Write #15, ps count5 = count5 + 1 End If If ph(1) = 6 Then Write #16, ps count6 = count6 + 1 End If If ph(1) = 7 Then Write #17, ps count7 = count7 + 1 End If If ph(1) = 8 Then Write #18, ps count8 = count8 + 1 End If If ph(1) = 9 Then Write #19, ps count9 = count9 + 1 End If Next x Close #1 Close #10 Close #11 Close #12 Close #13 Close #14 Close #15 Close #16 Close #17 Close #18 Close #19 ' output counts Open "jun27-1c" For Output As #20 Write #20, 0, count0, 1, count1, 2, count2, 3, count3, 4, count4, 5, count5, 6, count6, 7, count7, 8, count8, 9, count9 Close #20
' Dividing file into 10 by third digit Dim ph(26) dim ps as string ' enter count in xxxx xxxx = Open "jun27-11" For Input As #1 Open "jun27-110" For Output As #10 Open "jun27-111" For Output As #11 Open "jun27-112" For Output As #12 Open "jun27-113" For Output As #13 Open "jun27-114" For Output As #14 Open "jun27-115" For Output As #15 Open "jun27-116" For Output As #16 Open "jun27-117" For Output As #17 Open "jun27-118" For Output As #18 Open "jun27-119" For Output As #19 count9 = 0 count8 = 0 xxxx=xxxx - 1 For x = 0 To xxxx Input #1, ps a3$ = Mid(ps, 3, 1) ph(2) = Val(a3$) If ph(2) = 0 Then Write #10, ps count0 = count0 + 1 End If If ph(2) = 1 Then Write #11, ps count1 = count1 + 1 End If If ph(2) = 2 Then Write #12, ps count2 = count2 + 1 End If If ph(2) = 3 Then Write #13, ps count3 = count3 + 1 End If If ph(2) = 4 Then Write #14, ps count4 = count4 + 1 End If If ph(2) = 5 Then Write #15, ps count5 = count5 + 1 End If If ph(2) = 6 Then Write #16, ps count6 = count6 + 1 End If If ph(2) = 7 Then Write #17, ps count7 = count7 + 1 End If If ph(2) = 8 Then Write #18, ps count8 = count8 + 1 End If If ph(2) = 9 Then Write #19, ps count9 = count9 + 1 End If Next x Close #1 Close #10 Close #11 Close #12 Close #13 Close #14 Close #15 Close #16 Close #17 Close #18 Close #19 Open "jun27-11c" For Output As #20 Write #20, 0, count0, 1, count1, 2, count2, 3, count3, 4, count4, 5, count5, 6, count6, 7, count7, 8, count8, 9, count9 Close #20
' Dividing file into 10 by fourth digit Dim ph(26) dim ps as string ' enter count in xxxx xxxx = Open "jun27-131" For Input As #1 Open "jun27-1310" For Output As #10 Open "jun27-1311" For Output As #11 Open "jun27-1312" For Output As #12 Open "jun27-1313" For Output As #13 Open "jun27-1314" For Output As #14 Open "jun27-1315" For Output As #15 Open "jun27-1316" For Output As #16 Open "jun27-1317" For Output As #17 Open "jun27-1318" For Output As #18 Open "jun27-1319" For Output As #19 count9 = 0 count8 = 0 xxxx=xxxx - 1 For x = 0 To xxxx Input #1, ps a4$ = Mid(ps, 4, 1) ph(3) = Val(a4$) If ph(3) = 0 Then Write #10, ps count0 = count0 + 1 End If If ph(3) = 1 Then Write #11, ps count1 = count1 + 1 End If If ph(3) = 2 Then Write #12, ps count2 = count2 + 1 End If If ph(3) = 3 Then Write #13, ps count3 = count3 + 1 End If If ph(3) = 4 Then Write #14, ps count4 = count4 + 1 End If If ph(3) = 5 Then Write #15, ps count5 = count5 + 1 End If If ph(3) = 6 Then Write #16, ps count6 = count6 + 1 End If If ph(3) = 7 Then Write #17, ps count7 = count7 + 1 End If If ph(3) = 8 Then Write #18, ps count8 = count8 + 1 End If If ph(3) = 9 Then Write #19, ps count9 = count9 + 1 End If Next x Close #1 Close #10 Close #11 Close #12 Close #13 Close #14 Close #15 Close #16 Close #17 Close #18 Close #19 Open "jun27-131c" For Output As #20 Write #20, 0, count0, 1, count1, 2, count2, 3, count3, 4, count4, 5, count5, 6, count6, 7, count7, 8, count8, 9, count9 Close #20
' Dividing file into 10 by fifth digit Dim ph(26) Dim ps As String ' enter count in xxxx xxxx = Open "jun27-3412" For Input As #1 Open "jun27-34120" For Output As #10 Open "jun27-34121" For Output As #11 Open "jun27-34122" For Output As #12 Open "jun27-34123" For Output As #13 Open "jun27-34124" For Output As #14 Open "jun27-34125" For Output As #15 Open "jun27-34126" For Output As #16 Open "jun27-34127" For Output As #17 Open "jun27-34128" For Output As #18 Open "jun27-34129" For Output As #19 count9 = 0 count8 = 0 xxxx = xxxx - 1 For x = 0 To xxxx Input #1, ps a5$ = Mid(ps, 5, 1) ph(4) = Val(a5$) If ph(4) = 0 Then Write #10, ps count0 = count0 + 1 End If If ph(4) = 1 Then Write #11, ps count1 = count1 + 1 End If If ph(4) = 2 Then Write #12, ps count2 = count2 + 1 End If If ph(4) = 3 Then Write #13, ps count3 = count3 + 1 End If If ph(4) = 4 Then Write #14, ps count4 = count4 + 1 End If If ph(4) = 5 Then Write #15, ps count5 = count5 + 1 End If If ph(4) = 6 Then Write #16, ps count6 = count6 + 1 End If If ph(4) = 7 Then Write #17, ps count7 = count7 + 1 End If If ph(4) = 8 Then Write #18, ps count8 = count8 + 1 End If If ph(4) = 9 Then Write #19, ps count9 = count9 + 1 End If Next x Close #1 Close #10 Close #11 Close #12 Close #13 Close #14 Close #15 Close #16 Close #17 Close #18 Close #19 Open "jun27-3412c" For Output As #20 Write #20, 0, count0, 1, count1, 2, count2, 3, count3, 4, count4, 5, count5, 6, count6, 7, count7, 8, count8, 9, count9 Close #20
The next is sorting using Word Tables/ Sort Before using ,make a test batch of numbers as there are various Sort outcomes. Now I'm using string data, Text sort gave the right form on my machine. Use Ctrl+shift+Home(or End) to highlight text up or down . After sort and before saving to disk press up or down arrow to select which way the text is returned to you. My set-up was limited to no more than 15,000. To sort say 28,000 sort upper half ,then lower half then cut and paste say 0 to 2 section of lower half into the top of the top half. Re-sort the expanded 0 to 2 section then re-sort the remainder. If say selecting 2 to 3 section then cut and paste at the juncture of 2 and 3 in the other block to save some repeated sorting. Other times it is quicker to oversort then backtrack / overlap on the next sort. Many of the subdivision files are empty because of the directing. They consist of eg 4,4.. 4,5.... etc never 4,0.., 4,1.. etc and a number of 8 and 9 sections are absent back to the generator characteristics eg only first 8 of 10 are used. When you know all files are less than 15,000, or whatever Sort limit, use the next (simply a recorded macro) to sort 10 related files. An empty file will stop the macro so edit out empty files before running.
'Sort 10 related files in one go ' Documents.Open FileName:="jun27-130", ConfirmConversions:=False, ReadOnly _ :=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _ :="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _ , Format:=wdOpenFormatAuto Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _ SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _ FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _ wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _ wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _ wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _ :=wdLanguageNone ActiveDocument.Save ' Documents.Open FileName:="jun27-131", ConfirmConversions:=False, ReadOnly _ :=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _ :="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _ , Format:=wdOpenFormatAuto Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _ SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _ FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _ wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _ wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _ wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _ :=wdLanguageNone ActiveDocument.Save ' Documents.Open FileName:="jun27-132", ConfirmConversions:=False, ReadOnly _ :=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _ :="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _ , Format:=wdOpenFormatAuto Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _ SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _ FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _ wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _ wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _ wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _ :=wdLanguageNone ActiveDocument.Save ' Documents.Open FileName:="jun27-133", ConfirmConversions:=False, ReadOnly _ :=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _ :="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _ , Format:=wdOpenFormatAuto Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _ SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _ FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _ wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _ wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _ wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _ :=wdLanguageNone ActiveDocument.Save ' Documents.Open FileName:="jun27-134", ConfirmConversions:=False, ReadOnly _ :=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _ :="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _ , Format:=wdOpenFormatAuto Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _ SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _ FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _ wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _ wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _ wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _ :=wdLanguageNone ActiveDocument.Save ' Documents.Open FileName:="jun27-135", ConfirmConversions:=False, ReadOnly _ :=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _ :="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _ , Format:=wdOpenFormatAuto Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _ SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _ FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _ wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _ wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _ wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _ :=wdLanguageNone ActiveDocument.Save ' Documents.Open FileName:="jun27-136", ConfirmConversions:=False, ReadOnly _ :=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _ :="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _ , Format:=wdOpenFormatAuto Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _ SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _ FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _ wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _ wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _ wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _ :=wdLanguageNone ActiveDocument.Save ' Documents.Open FileName:="jun27-137", ConfirmConversions:=False, ReadOnly _ :=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _ :="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _ , Format:=wdOpenFormatAuto Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _ SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _ FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _ wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _ wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _ wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _ :=wdLanguageNone ActiveDocument.Save ' Documents.Open FileName:="jun27-138", ConfirmConversions:=False, ReadOnly _ :=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _ :="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _ , Format:=wdOpenFormatAuto Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _ SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _ FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _ wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _ wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _ wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _ :=wdLanguageNone ActiveDocument.Save ' Documents.Open FileName:="jun27-139", ConfirmConversions:=False, ReadOnly _ :=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _ :="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _ , Format:=wdOpenFormatAuto Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _ SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _ FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _ wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _ wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _ wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _ :=wdLanguageNone ActiveDocument.Save
' It is possible to sometimes join the above sort macro ' with this re-concattenation macro ' ' empty files will append spurious carriage returns at ' head or tail of files so check for this before final match routine ' otherwise use Insert / File to merge files ' merge 10 related files back to one ' for convenience I named these re-concattenated ' files as .txt so they were obvious in listing ' compared to no suffix ones ' in windows explorer the file sizes of these .txt files ' should have the same file-size as the original ' unexpanded and unsorted file ' use "Word" find for any repeated repeatedend of line "^p^p" ' Documents.Add Template:="", NewTemplate:=False Selection.InsertFile FileName:="jun27-130", Range:="", ConfirmConversions _ :=False, Link:=False, Attachment:=False Selection.InsertFile FileName:="jun27-131", Range:="", ConfirmConversions _ :=False, Link:=False, Attachment:=False Selection.InsertFile FileName:="jun27-132", Range:="", ConfirmConversions _ :=False, Link:=False, Attachment:=False Selection.InsertFile FileName:="jun27-133", Range:="", ConfirmConversions _ :=False, Link:=False, Attachment:=False Selection.InsertFile FileName:="jun27-134", Range:="", ConfirmConversions _ :=False, Link:=False, Attachment:=False Selection.InsertFile FileName:="jun27-135", Range:="", ConfirmConversions _ :=False, Link:=False, Attachment:=False Selection.InsertFile FileName:="jun27-136", Range:="", ConfirmConversions _ :=False, Link:=False, Attachment:=False Selection.InsertFile FileName:="jun27-137", Range:="", ConfirmConversions _ :=False, Link:=False, Attachment:=False Selection.InsertFile FileName:="jun27-138", Range:="", ConfirmConversions _ :=False, Link:=False, Attachment:=False Selection.InsertFile FileName:="jun27-139", Range:="", ConfirmConversions _ :=False, Link:=False, Attachment:=False ActiveDocument.SaveAs FileName:="jun27-13.txt", FileFormat:=wdFormatText, _ LockComments:=False, Password:="", AddToRecentFiles:=True, WritePassword _ :="", ReadOnlyRecommended:=False, EmbedTrueTypeFonts:=False, _ SaveNativePictureFormat:=False, SaveFormsData:=False, SaveAsAOCELetter:= _ False
Copy and paste all these subfiles together to submit to the next section. The final match finding, initially for 12 digits ,then change to 14,16,18 and finally 20 if 18 shows something. This routine after hours of dividing/sorting/re-merging takes only seconds to complete.
' Find matching pairs in 12 digits ' ie not including triples ( there was an error in previous ' versions of these macros that over-inflated the number ' of pair matches by the number of triple situations ) ' xxxx is count from the count files xxxx = b$ = "0" c$ = "0" Count = 0 Dim ps As String xxxx = xxxx - 1 Open "jun27-1.txt" For Input As #1 Open "jun27-0pair12.txt" For Output As #2 ' change the 12 in the #2 file name above and ' the Left function below to suit number of matches For x = 0 To xxxx Input #1, ps a$ = Left(ps, 12) a2$ = ps If a$ = b$ Then Write #2, a2$, b2$ Count = Count + 1 End If If a$ = c$ Then Count = Count - 1 If a$ = b$ Then c$ = b$ c2$ = b2$ End If b$ = a$ b2$ = a2$ Next x Write #2, "Count ", Count Close #1 Close #2
' Find matching triples in 12 digits ' xxxx is count xxxx = b$ = "0" c$ = "0" Count = 0 Dim ps As String xxxx = xxxx - 1 Open "jun27-1.txt" For Input As #1 Open "jun27-1trip12.txt" For Output As #2 ' change the 12 in the #2 file name above and ' the Left function below to suit number of matches For x = 0 To xxxx Input #1, ps a$ = Left(ps, 12) a2$ = ps If a$ = c$ Then Write #2, a2$, b2$, c2$ Count = Count + 1 End If If a$ = d$ Then Count = Count - 1 If a$ = b$ Then d$ = c$ d2$ = c2$ End If If a$ = b$ Then c$ = b$ c2$ = b2$ End If b$ = a$ b2$ = a2$ Next x Write #2, "Count ", Count Close #1 Close #2
' Find matching quadruples in 12 digits ' xxxx is the count xxxx = b$ = "0" c$ = "0" Count = 0 Dim ps As String xxxx = xxxx - 1 Open "jun27-3.txt" For Input As #1 Open "jun27-3quad12.txt" For Output As #2 ' change the 12 in the #2 file name above and ' the Left function below to suit number of matches For x = 0 To xxxx Input #1, ps a$ = Left(ps, 12) a2$ = ps If a$ = d$ Then Write #2, a2$, b2$, c2$, d2$ Count = Count + 1 End If If a$ = e$ Then Count = Count - 1 If a$ = b$ Then e$ = d$ e2$ = d2$ End If If a$ = c$ Then d$ = c$ d2$ = c2$ End If If a$ = b$ Then c$ = b$ c2$ = b2$ End If b$ = a$ b2$ = a2$ Next x Write #2, "Count ", Count Close #1 Close #2
' Find matching quintuples in 12 digits ' xxxx is the count xxxx = b$ = "0" c$ = "0" Count = 0 Dim ps As String xxxx = xxxx - 1 Open "jun27-4.txt" For Input As #1 Open "jun27-4quin12.txt" For Output As #2 ' change the 12 in the #2 file name above and ' the Left function below to suit number of matches For x = 0 To xxxx Input #1, ps a$ = Left(ps, 12) a2$ = ps If a$ = e$ Then Write #2, a2$, b2$, c2$, d2$, e2$ Count = Count + 1 End If If a$ = f$ Then Count = Count - 1 If a$ = b$ Then f$ = e$ f2$ = e2$ End If If a$ = d$ Then e$ = d$ e2$ = d2$ End If If a$ = c$ Then d$ = c$ d2$ = c2$ End If If a$ = b$ Then c$ = b$ c2$ = b2$ End If b$ = a$ b2$ = a2$ Next x Write #2, "Count ", Count Close #1 Close #2
Cross-check for generator comparing output allele frequencies against intended allele frequencies for 5,000 profiles D3 Intended Generated (per cent ) 0 0.5 0.62 1 0.5 0.48 2 11.4 10.9 3 25.8 25.5 4 24.1 23.96 5 21.7 21.57 6 14.8 15.69 7 1.0 1.06 8 0.2 0.22 VWA 0 .2 .25 1 8.2 7.82 2 10.9 10.4 3 22.7 22.81 4 27.1 27.91 5 21.2 21.04 6 8.4 8.32 7 1.4 1.45 D8 0 1.2 1.1 1 1.0 .97 2 8.5 8.37 3 8.0 7.76 4 13.7 14.39 5 31.6 31.55 6 22.5 22.43 7 11.1 11.07 8 2.4 2.36 D5 0 .2 .22 1 .2 .19 2 2.7 2.54 3 6.8 6.9 4 33.6 33.45 5 38.9 39.09 6 16.0 15.92 7 1.5 1.59 D13 0 11.1 10.87 1 8.7 8.54 2 7.0 6.8 3 27.5 27.7 4 31.9 32.92 5 8.2 8.22 6 5.6 4.95 D7 0 2.0 2.17 1 14.2 14.25 2 14.0 14.45 3 28.7 28.33 4 22.7 22.43 5 14.0 13.93 6 3.8 3.61 7 0.7 0.83 D16 0 17.0 16.9 1 11.6 12.28 2 4.9 4.86 3 29.0 28.72 4 35.2 34.6 5 14.8 15.11 6 2.7 2.74 FGA 0 0.2 0.23 1 0.9 0.89 2 8.0 8.85 3 15.4 15.43 4 18.4 17.66 5 0.2 0.2 6 17.4 17.48 7 0.7 0.65 8 14.3 14.39 9 14.2 14.57 A 6.7 6.36 B 3.1 2.7 C 0.7 0.59 D21 0 0.5 0.57 1 3.6 3.63 2 16.0 16.59 3 25.1 24.81 4 0.2 0.15 5 22.9 22.79 6 1.7 15.2 7 8.5 8.33 8 8.9 8.61 9 1.2 1.0 A 7.8 7.99 B 3.2 3.53 C 0.2 0.25 D 0.2 0.23 D18 0 0.5 0.51 1 1.2 11.1 2 11.9 12.54 3 11.9 11.67 4 17.9 17.47 5 16.7 16.67 6 14.0 13.81 7 12.1 12.1 8 6.1 6.16 9 3.1 3.22 A 19.0 19.1 B 1.2 1.4 C 0.5 0.59 D 0.7 0.66 E 0.2 0.18 THO1 0 23.9 23.4 1 16.4 16.32 2 10.9 11.39 3 16.7 17.2 4 30.9 30.58 5 1.2 1.11 TPOX 0 54.3 53.77 1 8.9 8.93 2 6.8 6.82 3 26.8 26.98 4 3.2 3.5 CSF1PO 0 0.2 0.18 1 2.9 3.13 2 27.0 26.72 3 31.9 31.79 4 30.9 31.33 5 6.3 5.86 6 0.7 0.77 7 0.2 0.22

Results

For 13 loci in order D3,vWA,D8,D5,D13,D7,D16,FGA,D21,D18,THO1,TPOX,CSF1PO and using RCMP data for Toronto caucasian allele frequencies http://www.csfs.ca/databases/cfs_CC_ProfilerPlus_freq.htm http://www.csfs.ca/databases/cfs_CC_Cofiler_freq.htm and 10 million profiles - no pair mtches on 12 or 13 loci The random number generator worked fine for 260 million calls in sequence as no 13 loci false matches and I checked back to the generator file for each of the following 4 and all started as various undirected sequences. 4 separate 11 loci pairs of matches converted back to standard representation D3,vWA,D8,D5,D13,D7,D16,FGA,D21,D18,THO1 (15,16)(17,18)(13,14)(11,12)(8,11)(10,11)(11,11)(20,24)(29,31)(15,16)(6,9.3) (15,16)(14,17)(13,14)(11,12)(11,12)(10,12)(12,13)(20,21)(29,29)(13,15)(6,9.3) (15,18)(16,18)(10,14)(12,12)(9,12)(11,12)(12,12)(20,22)(29,29)(13,15)(6,6) (18,18)(16,17)(14,15)(11,11)(11,12)(9,9)(12,12)(20,23)(28,32.2)(16,17)(9.3,9.3) The minimum allele frequencies for these 4 matches was 8.5 per cent, 8.2 , 8.5 and 7.8 Then each case one allele of each of the locus 12 and 13 matching. Any 11 loci from 13 would produce many more matches of course. That is despite 2 chances of 54 percent for each 'person' to inherit allele 8 on TPOX or 79 percent for at least one TPOX 8 allele. I would conjecture that to obtain one CODIS match would require about 26 million profiles. Summary of results so far would suggest this Table for totally random (no inter-relatedness ) 'populations' of 4 and 10 million 6 loci : 1 in 14,000 7 loci : 1 in 78,000 8 loci : 1 in 0.26 million 9 loci : 1 in 0.87 million 10 loci : 1 in 1.8 million 11 loci : 1 in 4.5 million 12 loci : 1 in 11 million (projected) 13 loci : 1 in 26 million (projected) The fractional discontinuity at 9 to 10 loci probably due to swapping from UK loci to US loci. The most remarkable result is the effect on triple matches and higher n-tuple matches. The following is a bit apples and pears because the 4 million results were for UK and 10 million for US loci , 7 of which are the same. Loci / Pairs in 4 million / 10 million 6- 88,000/ 1.46 million 7- 2,600/ 535,000 8- 243/ 33,000 9- 21/ 1,570 10- 4/ 44 11- -/ 4 So a 2.5 factor increase in population increases the number of reduced loci matches phenomenally - well above any square law or even cubes - even as much as to the power of 5 For the 10 loci situation despite change from UK to USA loci the reults are consistent with square law. 4.8 matches in 4 million UK would suggest 4.8 * 2.5^2 for 10 million or 30 predicted. US 10 million gioves 44 Loci / Triples in 4 million / 10m 6- 8,018/ 626,000 Then 10 million only 7- 105,000 8- 739 9- 2 Loci / Quadruples in 4 million / 10m 6- 1024/ 338,000 Then 10 million only 7- 32,000 8- 35 Loci / Quintuples in 4 million / 10m 6- 176/ 207,000 Then for 10 million only 7- 12,700 8- 2 Remember the forensic statisticians in the 1990s quoting in court the probability for an unrelated 6 locus profile match to this defendent being 37 million to 1 against. 37 million 6 locus profiles would be totally swamped with n-tuple repeats let alone pair matches To go to a 30 million profile simulation I would have to automate a lot of the processing to be done overnight as the generation would take 16 hours and just the sorting, on my pc, would take perhaps 100 hours and all require 10G Byte of free disk space.
' Generating 13 loci x2 profiles excluding rare alleles ' as set here for all allele frequencies ' less than 8% 0.08 ' To change from 8% to 5% or whatever delete the appropriate ' flagged allele conditionals eg "If ph(j) = "2" Then flag = 1" ' directing pairs and first divider Dim ph(26) Dim pb(26) ' initialising Random Number Generator - RNG count9 = 0 count8 = 0 countf = 0 Randomize a = 214013 c = 2531011 x0 = Timer z = 2 ^ 24 ' 1 file 'jul06-g' for original, un-directed pairs, source data. ' This file is necessary to check on the performance of the RNG ' when a matched pair is found then it is highly unlikely that ' both sequences as generated, before pair directing, would ' be the same - more likely a manifest of repeat within the RNG ' (reason for adopting the 214013 / 2531011 RNG ) ' Use 'Word' find function on part of the sequences, including pair reversals, ' with luck would include a 'homozygotic' pair eg (3,3) say ,so no reversal ' on that pair Open "jul06-g" For Output As #1 ' outputs directed and divided by first digit Open "jul06-0" For Output As #10 Open "jul06-1" For Output As #11 Open "jul06-2" For Output As #12 Open "jul06-3" For Output As #13 Open "jul06-4" For Output As #14 Open "jul06-5" For Output As #15 Open "jul06-6" For Output As #16 Open "jul06-7" For Output As #17 Open "jul06-8" For Output As #18 Open "jul06-9" For Output As #19 ' change xxxx for different total size, no large numbers are required ' as only obtaining a scaling factor xxxx = 26000 xxxx = xxxx - 1 For x = 0 To xxxx flag = 0 For j = 0 To 1 ' D3 , locus 1 ' RNG random number generator temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.005 Then ph(j) = 11 If ph(j) < 0.01 Then ph(j) = 1 If ph(j) < 0.124 Then ph(j) = 2 If ph(j) < 0.382 Then ph(j) = 3 If ph(j) < 0.623 Then ph(j) = 4 If ph(j) < 0.84 Then ph(j) = 5 If ph(j) < 0.988 Then ph(j) = 6 If ph(j) < 0.998 Then ph(j) = 7 If ph(j) < 1 Then ph(j) = 8 If ph(j) > 10 Then ph(j) = 0 If ph(j) = "0" Then flag = 1 If ph(j) = "1" Then flag = 1 If ph(j) = "7" Then flag = 1 If ph(j) = "8" Then flag = 1 Next j For j = 2 To 3 ' vWA locus 2 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.002 Then ph(j) = 11 If ph(j) < 0.084 Then ph(j) = 1 If ph(j) < 0.193 Then ph(j) = 2 If ph(j) < 0.42 Then ph(j) = 3 If ph(j) < 0.691 Then ph(j) = 4 If ph(j) < 0.903 Then ph(j) = 5 If ph(j) < 0.987 Then ph(j) = 6 If ph(j) < 1 Then ph(j) = 7 If ph(j) > 10 Then ph(j) = 0 If ph(j) = "0" Then flag = 1 If ph(j) = "7" Then flag = 1 Next j For j = 4 To 5 ' D8 , locus 3 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.012 Then ph(j) = 11 If ph(j) < 0.022 Then ph(j) = 1 If ph(j) < 0.107 Then ph(j) = 2 If ph(j) < 0.187 Then ph(j) = 3 If ph(j) < 0.324 Then ph(j) = 4 If ph(j) < 0.64 Then ph(j) = 5 If ph(j) < 0.865 Then ph(j) = 6 If ph(j) < 0.976 Then ph(j) = 7 If ph(j) < 1 Then ph(j) = 8 If ph(j) > 10 Then ph(j) = 0 If ph(j) = "0" Then flag = 1 If ph(j) = "1" Then flag = 1 If ph(j) = "3" Then flag = 1 If ph(j) = "8" Then flag = 1 Next j For j = 6 To 7 ' D5 , locus 4 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.002 Then ph(j) = 11 If ph(j) < 0.004 Then ph(j) = 1 If ph(j) < 0.031 Then ph(j) = 2 If ph(j) < 0.099 Then ph(j) = 3 If ph(j) < 0.435 Then ph(j) = 4 If ph(j) < 0.824 Then ph(j) = 5 If ph(j) < 0.985 Then ph(j) = 6 If ph(j) < 1 Then ph(j) = 7 If ph(j) > 10 Then ph(j) = 0 If ph(j) = "0" Then flag = 1 If ph(j) = "1" Then flag = 1 If ph(j) = "2" Then flag = 1 If ph(j) = "3" Then flag = 1 If ph(j) = "7" Then flag = 1 Next j For j = 8 To 9 ' D13 , locus 5 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.111 Then ph(j) = 11 If ph(j) < 0.198 Then ph(j) = 1 If ph(j) < 0.268 Then ph(j) = 2 If ph(j) < 0.543 Then ph(j) = 3 If ph(j) < 0.862 Then ph(j) = 4 If ph(j) < 0.944 Then ph(j) = 5 If ph(j) < 1 Then ph(j) = 6 If ph(j) > 10 Then ph(j) = 0 If ph(j) = "2" Then flag = 1 If ph(j) = "6" Then flag = 1 Next j For j = 10 To 11 ' D7, locus 6 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.02 Then ph(j) = 11 If ph(j) < 0.162 Then ph(j) = 1 If ph(j) < 0.302 Then ph(j) = 2 If ph(j) < 0.589 Then ph(j) = 3 If ph(j) < 0.816 Then ph(j) = 4 If ph(j) < 0.955 Then ph(j) = 5 If ph(j) < 0.993 Then ph(j) = 6 If ph(j) < 1 Then ph(j) = 7 If ph(j) > 10 Then ph(j) = 0 If ph(j) = "0" Then flag = 1 If ph(j) = "6" Then flag = 1 If ph(j) = "7" Then flag = 1 Next j For j = 12 To 13 ' D16, locus 7 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.017 Then ph(j) = 11 If ph(j) < 0.133 Then ph(j) = 1 If ph(j) < 0.182 Then ph(j) = 2 If ph(j) < 0.472 Then ph(j) = 3 If ph(j) < 0.824 Then ph(j) = 4 If ph(j) < 0.972 Then ph(j) = 5 If ph(j) < 1 Then ph(j) = 6 If ph(j) > 10 Then ph(j) = 0 If ph(j) = "0" Then flag = 1 If ph(j) = "2" Then flag = 1 If ph(j) = "6" Then flag = 1 Next j For j = 14 To 15 ' FGA, locus 8 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj pb(j) = "Z" If ph(j) < 0.002 Then ph(j) = 11 If ph(j) < 0.011 Then ph(j) = 1 If ph(j) < 0.091 Then ph(j) = 2 If ph(j) < 0.245 Then ph(j) = 3 If ph(j) < 0.429 Then ph(j) = 4 If ph(j) < 0.431 Then ph(j) = 5 If ph(j) < 0.605 Then ph(j) = 6 If ph(j) < 0.612 Then ph(j) = 7 If ph(j) < 0.755 Then ph(j) = 8 If ph(j) < 0.897 Then ph(j) = 9 If ph(j) < 0.963 And ph(j) >= 0.897 Then pb(j) = "A" If ph(j) < 0.993 And ph(j) >= 0.963 Then pb(j) = "B" If ph(j) < 1 And ph(j) >= 0.993 Then pb(j) = "C" If ph(j) > 10 Then ph(j) = 0 If pb(j) <> "Z" Then ph(j) = pb(j) If ph(j) = "0" Then flag = 1 If ph(j) = "1" Then flag = 1 If ph(j) = "2" Then flag = 1 If ph(j) = "5" Then flag = 1 If ph(j) = "7" Then flag = 1 If pb(j) = "A" Then flag = 1 If pb(j) = "B" Then flag = 1 If pb(j) = "C" Then flag = 1 Next j For j = 16 To 17 ' D21, locus 9 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj pb(j) = "Z" If ph(j) < 0.005 Then ph(j) = 11 If ph(j) < 0.041 Then ph(j) = 1 If ph(j) < 0.201 Then ph(j) = 2 If ph(j) < 0.452 Then ph(j) = 3 If ph(j) < 0.454 Then ph(j) = 4 If ph(j) < 0.683 Then ph(j) = 5 If ph(j) < 0.7 Then ph(j) = 6 If ph(j) < 0.785 Then ph(j) = 7 If ph(j) < 0.874 Then ph(j) = 8 If ph(j) < 0.886 Then ph(j) = 9 If ph(j) < 0.964 And ph(j) >= 0.886 Then pb(j) = "A" If ph(j) < 0.996 And ph(j) >= 0.964 Then pb(j) = "B" If ph(j) < 0.998 And ph(j) >= 0.996 Then pb(j) = "C" If ph(j) < 1 And ph(j) >= 0.998 Then pb(j) = "D" If ph(j) > 10 Then ph(j) = 0 If pb(j) <> "Z" Then ph(j) = pb(j) If ph(j) = "0" Then flag = 1 If ph(j) = "1" Then flag = 1 If ph(j) = "4" Then flag = 1 If ph(j) = "6" Then flag = 1 If ph(j) = "9" Then flag = 1 If pb(j) = "A" Then flag = 1 If pb(j) = "B" Then flag = 1 If pb(j) = "C" Then flag = 1 If pb(j) = "D" Then flag = 1 Next j For j = 18 To 19 ' D18, locus 10 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj pb(j) = "Z" If ph(j) < 0.005 Then ph(j) = 11 If ph(j) < 0.017 Then ph(j) = 1 If ph(j) < 0.137 Then ph(j) = 2 If ph(j) < 0.256 Then ph(j) = 3 If ph(j) < 0.435 Then ph(j) = 4 If ph(j) < 0.602 Then ph(j) = 5 If ph(j) < 0.742 Then ph(j) = 6 If ph(j) < 0.863 Then ph(j) = 7 If ph(j) < 0.924 Then ph(j) = 8 If ph(j) < 0.955 Then ph(j) = 9 If ph(j) < 0.974 And ph(j) >= 0.955 Then pb(j) = "A" If ph(j) < 0.986 And ph(j) >= 0.974 Then pb(j) = "B" If ph(j) < 0.991 And ph(j) >= 0.986 Then pb(j) = "C" If ph(j) < 0.998 And ph(j) >= 0.991 Then pb(j) = "D" If ph(j) < 1 And ph(j) >= 0.998 Then pb(j) = "E" If ph(j) > 10 Then ph(j) = 0 If pb(j) <> "Z" Then ph(j) = pb(j) If ph(j) = "0" Then flag = 1 If ph(j) = "1" Then flag = 1 If ph(j) = "8" Then flag = 1 If ph(j) = "9" Then flag = 1 If pb(j) = "A" Then flag = 1 If pb(j) = "B" Then flag = 1 If pb(j) = "C" Then flag = 1 If pb(j) = "D" Then flag = 1 If pb(j) = "E" Then flag = 1 Next j For j = 20 To 21 ' THO1 , locus 11 ' RNG random number generator temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.239 Then ph(j) = 11 If ph(j) < 0.403 Then ph(j) = 1 If ph(j) < 0.512 Then ph(j) = 2 If ph(j) < 0.679 Then ph(j) = 3 If ph(j) < 0.988 Then ph(j) = 4 If ph(j) < 1 Then ph(j) = 5 If ph(j) > 10 Then ph(j) = 0 If ph(j) = "5" Then flag = 1 Next j For j = 22 To 23 ' TPOX, locus 12 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.543 Then ph(j) = 11 If ph(j) < 0.632 Then ph(j) = 1 If ph(j) < 0.7 Then ph(j) = 2 If ph(j) < 0.968 Then ph(j) = 3 If ph(j) < 1 Then ph(j) = 4 If ph(j) > 10 Then ph(j) = 0 If ph(j) = "2" Then flag = 1 If ph(j) = "4" Then flag = 1 Next j For j = 24 To 25 ' CSF1PO , locus 13 ' RNG temp = x0 * a + c temp = temp / z x1 = (temp - Fix(temp)) * z x0 = x1 phj = x1 / z ph(j) = phj If ph(j) < 0.002 Then ph(j) = 11 If ph(j) < 0.031 Then ph(j) = 1 If ph(j) < 0.301 Then ph(j) = 2 If ph(j) < 0.62 Then ph(j) = 3 If ph(j) < 0.928 Then ph(j) = 4 If ph(j) < 0.991 Then ph(j) = 5 If ph(j) < 0.998 Then ph(j) = 6 If ph(j) < 1 Then ph(j) = 7 If ph(j) > 10 Then ph(j) = 0 If ph(j) = "0" Then flag = 1 If ph(j) = "1" Then flag = 1 If ph(j) = "5" Then flag = 1 If ph(j) = "6" Then flag = 1 If ph(j) = "7" Then flag = 1 Next j If flag = 1 Then countf = countf + 1 If flag = 0 Then ' output the original generated file Write #1, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) ' output the original generated file For j = 0 To 24 Step 2 If ph(j + 1) < ph(j) Then jjj = ph(j) ph(j) = ph(j + 1) ph(j + 1) = jjj End If Next j ' put extra conditional statements here to reduce ' the number of files or just delete some of the following ' ' dividing on first column, file by file If ph(0) = 0 Then Write #10, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count0 = count0 + 1 End If If ph(0) = 1 Then Write #11, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count1 = count1 + 1 End If If ph(0) = 2 Then Write #12, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count2 = count2 + 1 End If If ph(0) = 3 Then Write #13, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count3 = count3 + 1 End If If ph(0) = 4 Then Write #14, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count4 = count4 + 1 End If If ph(0) = 5 Then Write #15, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count5 = count5 + 1 End If If ph(0) = 6 Then Write #16, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count6 = count6 + 1 End If If ph(0) = 7 Then Write #17, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count7 = count7 + 1 End If If ph(0) = 8 Then Write #18, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count8 = count8 + 1 End If If ph(0) = 9 Then Write #19, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25) count9 = count9 + 1 End If End If Next x Close #10 Close #11 Close #12 Close #13 Close #14 Close #15 Close #16 Close #17 Close #18 Close #19 Close #1 ' count file for data to fix for - next loops in sucessive dividings ' count for >= 8.0% allele frequencies is xxxx minus countf Open "jul06-c" For Output As #20 Write #20, 0, count0, 1, count1, 2, count2, 3, count3, 4, count4, 5, count5, 6, count6, 7, count7, 8, count8, 9, count9, "count flag ", xxxx - countf Close #20
So for the moment assuming 26 million totally random profiles to produce one 13 locus match then the above simulation produces the following results for that subset of the population with 13 locus profiles with no rare alleles and all allele frequencies greater than 1 percent then 1 in 17.5 million 3 per cent then 1 in 7 million 5 per cent then 1 in 5.9 million 8 per cent then 1 in 1.6 million I woul be interested to hear from anyone who is aware of processed large swathes of data showing non- Hardy-Weinberg equilibrium. That is cross-linking of loci/alleles showing up in real populations (similar to inheritance of blonde hair / blue eyes versus dark hair / dark eyes sort of effect). I've only, so far, seen references to Non H-WE effects in certain populations and specific loci but not data precise to the loci and alleles involved. As it is not across the board effect with all loci then it is hard to say it is a manifest of an inbred population. To go higher than 10 million I will have to automate a lot of this processing to do it hands-off over night or for hours at a time. First to determine the how the successive file dividings break down to give all file sizes below the maximum sort size under Word97. For locus L(h) with alleles i and j and allele frequency a. The sum of all allele frequencies is 1 so for S read sigma S(a(i,j)) = 1 for all h For X profiles in total For the locus L(h) the fraction of total in first column and allele i is F(h,i) = F(h-1)*[1+S(R(>i)) -S(R(i)) + S(R(i)) is the sum of the allele frequencies more than the ith and R(i)) S(R(

Email Paul Nutteing by removing 4 of the 5 dots
or email Paul Nutteing ,remove all but one dot
Or a message on usenet group uk.legal has got to me recently a couple of times.
My original accounts nutteing@quickfindit.com and the nutteing2@quickfindit account are now dead. A lot of the contents of this file plus other material 'peer reviewed' on the main forensic science usergroup

Background
A simulation of a large DNA profile database
A simulation of DNA profile 'families'
A simulation of DNA profile families with consanguinity
A simulation of DNA profile 'families' for 6 generations
dnas.htm revisited with all alleles represented
dnas.htm revisited for >8 percent allele frequency subset (similar ancestry )
Simulation of Taiwanese Tao and Rukai populations to explore the effect of within and without ancestral clusters
Basques autochthonous DNA profiles simulation, 9 loci
Australian Capital Caucasian 9 loci simulation
Australian Capital Caucasian 9 loci simulation, >= 5% allele frequency
CODIS, 13 Loci Caucasian Simulation
Automating the macros
Exploring other DNA profile match scenarios
Suspect familial matching
Return to co-ancestry factor in the NDNAD simulations
144 random matches in 65,000 -- ONLY?


Powered by counter.bloke.com