Lab
3 Details for MPCS 56600
Each
lab
will consist of a small problem and details of how to proceed.
Each lab is intended to give every student hands-on experience
with the core concepts and technologies covered during the
course. A student may concentrate, as a team member, on
one technology over another for the final project, but labs are
designed to give each and every student exposure to all the
technologies that come into play. You need to submit labs
to the TAs for grading--see submission instructions below.
Generally, unless otherwise specified, you will have one week to
complete each assigned lab.
See
the
syllabus for information on grading. Turning in lab
assignments on time is required, without exception, and all late deliveries will be
penalized, regardless
of cause. Submit your assignments to the
subversion repository according to the directions on the
syllabus page.
You
may
write these solutions in any programming language of
your choice. Our suggestion is now is not the time to
learn a new programming language along with the concepts
themselves. So our suggestion is to use whatever
programming language you know best.
Lab 3
Due: 5:00 pm, Monday, April 29, 2019
Problem
1:
Brute Force Cryptanalysis:
BACKGROUND:
Like
all programming problems, learning a new technology is not an
exercise in reading but rather an exercise in thinking and
typing. This lab is designed to give you more hands-on
experience in some fundamental skills in cryptography. You
will generally find the References section below helpful in
addition to the required and recommended reading.
One of the best ways to go about understand how ciphertext is
created is by deciphering it. This lab is designed to give
you experience in doing precisely that.
ADVICE: READ through the entire lab before
beginning work...
WHAT YOU NEED TO DO:
STEP 1:
A brute force approach to deciphering a rotation
cipher is straightforward. You simply run through the set of
rotation factors [0-26], and print out each result. Then,
look at the 26 different results, and find the one that looks like
English. Let's encrypt the string "Now is the winter of our
discontent made glorious summer". Let's use a rotation of
8. Our encrypted text is thus:
vwe qa bpm eqvbmz wn wcz lqakwvbmvb uilm otwzqwca acuumz
Using a brute force method outlined in class, we would roll
through the various possibilities, incrementing the rotation
factor from 01 to 26, and derive these results:
01: WXF RB CQN FRWCNA XO XDA MRBLXWCNWC VJMN PUXARXDB
BDVVNA
02: XYG SC DRO GSXDOB YP YEB NSCMYXDOXD WKNO QVYBSYEC
CEWWOB
03: YZH TD ESP HTYEPC ZQ ZFC OTDNZYEPYE XLOP RWZCTZFD
DFXXPC
04: ZAI UE FTQ IUZFQD AR AGD PUEOAZFQZF YMPQ SXADUAGE
EGYYQD
05: ABJ VF GUR JVAGRE BS BHE QVFPBAGRAG ZNQR TYBEVBHF
FHZZRE
06: BCK WG HVS KWBHSF CT CIF RWGQCBHSBH AORS UZCFWCIG
GIAASF
07: CDL XH IWT LXCITG DU DJG SXHRDCITCI BPST VADGXDJH
HJBBTG
08: DEM YI JXU MYDJUH EV EKH TYISEDJUDJ CQTU WBEHYEKI
IKCCUH
09: EFN ZJ KYV NZEKVI FW FLI UZJTFEKVEK DRUV XCFIZFLJ
JLDDVI
10: FGO AK LZW OAFLWJ GX GMJ VAKUGFLWFL ESVW YDGJAGMK
KMEEWJ
11: GHP BL MAX PBGMXK HY HNK WBLVHGMXGM FTWX ZEHKBHNL
LNFFXK
12: HIQ CM NBY QCHNYL IZ IOL XCMWIHNYHN GUXY AFILCIOM
MOGGYL
13: IJR DN OCZ RDIOZM JA JPM YDNXJIOZIO HVYZ BGJMDJPN
NPHHZM
14: JKS EO PDA SEJPAN KB KQN ZEOYKJPAJP IWZA CHKNEKQO
OQIIAN
15: KLT FP QEB TFKQBO LC LRO AFPZLKQBKQ JXAB DILOFLRP
PRJJBO
16: LMU GQ RFC UGLRCP MD MSP BGQAMLRCLR KYBC EJMPGMSQ
QSKKCP
17: MNV HR SGD VHMSDQ NE NTQ CHRBNMSDMS LZCD FKNQHNTR
RTLLDQ
18: NOW IS THE WINTER OF OUR DISCONTENT MADE GLORIOUS
SUMMER
19: OPX JT UIF XJOUFS PG PVS EJTDPOUFOU NBEF HMPSJPVT
TVNNFS
20: PQY KU VJG YKPVGT QH QWT FKUEQPVGPV OCFG INQTKQWU
UWOOGT
21: QRZ LV WKH ZLQWHU RI RXU GLVFRQWHQW PDGH JORULRXV
VXPPHU
22: RSA MW XLI AMRXIV SJ SYV HMWGSRXIRX QEHI KPSVMSYW
WYQQIV
23: STB NX YMJ BNSYJW TK TZW INXHTSYJSY RFIJ LQTWNTZX
XZRRJW
24: TUC OY ZNK COTZKX UL UAX JOYIUTZKTZ SGJK MRUXOUAY
YASSKX
25: UVD PZ AOL DPUALY VM VBY KPZJVUALUA THKL NSVYPVBZ
ZBTTLY
If we scan through the output of gobbledygook, notice line
number 18. That looks like English! And very fine
English, as it's Shakespeare's English. That's basically
it as far as a brute force method is concerned on a rotation
cipher.
Your first assignment:
We just got in the
following cipher text which we believe was encrypted using a
Caesarean rotation cipher:
maxzxxlxmatmetbwmaxzhewxgxzzlunmgxoxkvtvdexw
1. Code a program in your programming language of choice
(Java/haskell/C#/python/Go/ruby, etc.) and decipher the
encrypted text above using a brute force method.
2. One of the 26 lines of output should look like English
(without spaces...)
3. At the end of your program, print out the correct
rotation factor (you can hard-code this after discovering the
correct rotation factor)
4. At the end of the
program, print out the plaintext (you can hard-code this after
discovering the correct rotation factor)
5. And at the end of the program, print out who said this (you can hard-code this after
discovering the correct rotation factor)
Submit your source code for your solution as described below under
"Submitting."
Problem
2: Statistical Cryptanalysis:
BACKGROUND:
Like
all programming problems, learning a new technology is not an
exercise in reading but rather an exercise in thinking and
typing. This lab is designed to give you more hands-on
experience in some fundamental skills in cryptography. You
will generally find the References section below helpful in
addition to the required and recommended reading.
WHAT YOU NEED TO DO:
STEP 1:
We have received a text document (here) that contains encrypted data we
believe were created with a rotation cipher. You will use a
letter frequency analysis approach to decrypt this text. But
first, keep reading...
You may use the letter frequency tables below (these are drawn
from this wikipedia
article on this subject):



Relative
frequencies of letters in text
Relative
frequencies ordered by frequency
According to Robert Lewand (Cryptological Mathematics, The
Mathematical Association of America, 2000), arranged from most to
least common in appearance, the English letters are as follows:
etaoinshrdlcumwfgypbvkjxqz
Meaning that the English letter 'e' is most common, and the
English letter 'z' is least common across multiple inputs.
Using the common letter frequencies above, and ignoring all
numbers, spaces and punctuation and other non-alphabetic
characters, your job will be to decrypt a message to plain text
using statistical letter frequency analysis. Luckily, our
ciphertext is long enough to provide sufficient statistical
significance required for this type of frequency analysis.
There are a couple of approaches to this problem. One is to
create two lists, one, a list of the letter frequencies in our
actual encrypted ciphertext data that we are analyzing
(which you would need to produce from the encrypted data), let's
call this list Lc, and a second list of the standard
letter frequencies above (etaoinshrdlcumwfgypbvkjxqz),
let's call this Ls, and create a new string that
replaces the most common letter in the encrypted data (x)
with the corresponding (x) standard order letter, such
that: Lc[x] = Ls[x]. So for example, assume
that our plain text is this (it's not, but assume):
Meet in lobby at ten: "meetinlobbyatten" (we will ignore
spaces and punctuation for the time being).
And let's assume an array of 26 lowercase letters [a-z], and
let's further assume that this
text has been enciphered with a ROT9
encryption, where each character in the ciphertext is rotated
9 characters in the plaintext modulus 26, yielding the
following ciphertext (ignoring case):
meetinlobbyatten => dvvkzecfssprkkve
Since we know the rotation, we know that the first letter 'd'
(ASCII 100d) in the ciphertext is really the letter 'm' (ASCII
100 + 9 = 109 = 'm'), as we see in the first letter of the
plaintext. The next letter in the ciphertext (again,
ignoring spaces and punctuation) is 'v' (ASCII 118d), which
corresponds to the letter 'e' in the plaintext (ASCII
(101-118) mod26 [we are assuming an array of 26 lowercase
letters a-z]).
If we roll through the ciphertext creating a frequency
distribution of just the the letters in the ciphertext, we
come up with the following order:
{'v': 3, 'k': 3, 'e': 2, 's': 2, 'd': 1, 'z': 1, 'c': 1, 'f':
1, 'p': 1, 'r': 1}
There is a nice pdf that discusses this type of
letter frequency analysis (but on text more complicated than
ours here, which is not statistically significant to provide
fodder for our statistical analysis algorithm) located here.
The author, Steven Gordon, has even written a shell script
that will perform multiple types of encryption and decryption,
and it is located
here.
Download it, save it to a file called "crypt", and make it
executable, and play around with it, such as:
1. Perform a brute force attack on our sample ciphertext
(notice line 17 of the brute force method):
for i
in `seq 0 25`; do echo -n "$i "; crypto caesar dec
dvvkzecfssprkkve $i ; done ;
2. Or, performing the decryption knowing the offset
value:
crypto caesar dec dvvkzecfssprkkve 17
Play around with the tool and read the pdf mentioned
above. And then, move on to:
A second, and preferable way to
approach to this type of problem is perform a statistical
analysis such as the following:
1. Produce the frequencies of the letters in the encrypted
text (using again the same simple ciphertext of "dvvkzecfssprkkve",
ranked from most frequent to least frequent, along with their
counts:
{'v': 3, 'k': 3, 'e': 2, 's': 2, 'd': 1,
'z': 1, 'c': 1, 'f': 1, 'p': 1, 'r': 1}
2. Find the distance (shift) from the most common letter
in the ciphertext (in the case of a tie, just choose the
first) to the most common letter in the English Frequency
Table, which is as we know, the letter 'e'. In this
case, the shift from 'v' to 'e' mod 26 is 9 (note 17 + 9 =
26). This gives us a ROT9(x) for this
ciphertext. (See the table below for the full shift
values ROT9).
3. Create two lists (or combine/zip two lists, a
2-dimensional array, or a map, your call...), one with the
normal English alphabet of lowercase letters a-z as in the
first row below, let's call that List1.
Next, create a ROT9 list and call this List2,
as in the second row below. The "match point" in List2
is at the shift distance 9 (again, the shift from 'v' to 'e' mod 26
is 9), specifically, letter 'v' in
List1
is mapped to the letter 'e' in List2,
and the rest of List2 is just
wrapped around to complete: viz. 'w' is
mapped to 'f', 'x' is mapped to 'g', 'y' is mapped to 'h', 'z'
is mapped to 'i', and we wrap around to map 'a' to
'j', etc.
Thus we will have two lists that we wish to combine or map:

4. Then simply roll through the ciphertext character by
character, and replace each character encountered in the
ciphertext with it's corresponding English letter
indexed by the position of the ciphertext letter from the
plaintext alphabet in List1.
So let's work through the first four substitutions in the
ciphertext. Our simple ciphertext again is
"dvvkzecfssprkkve", and the first four letters in the
ciphertext are "dvvk". Here's how we would do this:
Using the table above, we look up 'd' in List1, and
find it's index, which is 3. We would then replace the
text at ciphertext[0] with the letter at List2[3], which is 'm'. That's
the first letter of our plaintext. Next, look up 'v' in List1, and
find it's index, which is 21. We would then
replace the text at ciphertext[1] with the letter
at List2[21],
which is 'e'. That's the second letter
of our plaintext. We'd do the same
thing with the second 'v' in our first for
ciphertext letters, replacing the character
at ciphertext[2] with the letter at List2[21],
which is 'e'.
That's our third
letter.
Finally, for our
four letters, we'd look
up 'k' in List1,
and find it's
index, which
is 10.
We would then
replace the
text at
ciphertext[3]
with the
letter at List2[10],
which is
't'.
That's our
fourth letter.
Our decrypted string
so far is
"meet". That's
pretty good.
We'd continue on
doing the same thing
until we have gone
through the entire
ciphertext, and our
result would be:
"meetinlobbyatten"
Here is a picture of
the ROT9(x)
table for our
example
(P=Plaintext,
C=Ciphertext):

By doing this form of statistical substitution, we leave
intact any punctuation in the ciphertext, which will carry
over into the decrypted plaintext just fine.
Ok, now that we know what we're doing, we got an intercept in this morning, and
below is the following ciphertext, and we
believe it was enciphered using a caesarean cipher:
"r glivcp gvvi kf gvvi mvijzfe fw vcvtkifezt trjy nflcu rccfn
feczev grpdvekj kf sv jvek uzivtkcp wifd fev grikp kf refkyvi
nzkyflk xfzex kyiflxy r wzeretzrc zejkzklkzfe. uzxzkrc jzxerklivj
gifmzuv grik fw kyv jfclkzfe, slk kyv drze svevwzkj riv cfjk zw r
kiljkvu kyziu grikp zj jkzcc ivhlzivu kf givmvek uflscv-jgveuzex.
nv gifgfjv r jfclkzfe kf kyv uflscv-jgveuzex gifscvd ljzex r
gvvi-kf-gvvi evknfib. kyv evknfib kzdvjkrdgj kirejrtkzfej sp
yrjyzex kyvd zekf re fexfzex tyrze fw yrjy-srjvu giffw-fw-nfib,
wfidzex r ivtfiu kyrk treefk sv tyrexvu nzkyflk ivufzex kyv
giffw-fw-nfib. kyv cfexvjk tyrze efk fecp jvimvj rj giffw fw kyv
jvhlvetv fw vmvekj nzkevjjvu, slk giffw kyrk zk trdv wifd kyv
crixvjk gffc fw tgl gfnvi. rj cfex rj r drafizkp fw tgl gfnvi zj
tfekifccvu sp efuvj kyrk riv efk tffgvirkzex kf rkkrtb kyv
evknfib, kyvp'cc xvevirkv kyv cfexvjk tyrze reu flkgrtv rkkrtbvij.
kyv evknfib zkjvcw ivhlzivj dzezdrc jkiltkliv. dvjjrxvj riv
sifrutrjk fe r svjk vwwfik srjzj, reu efuvj tre cvrmv reu ivafze
kyv evknfib rk nzcc, rttvgkzex kyv cfexvjk giffw-fw-nfib tyrze rj
giffw fw nyrk yrggvevu nyzcv kyvp nviv xfev."
Using the statistical approach outlined above:
1. Code a program in your programming language of choice
(Java/haskell/C#/python/Go/ruby, etc.) and decipher the encrypted
text above using the statistical method discussed in this section,
and have your program print out the plaintext.
2. Print out who said this and
in what work (you can
hard-code this after deciphering the plaintext)
3. Print out the rotation factor (you can hard-code this after
discovering the rotation factor)
4. Submit your source code for your solution per the
instructions below.
Problem
3: Hash Functions:
BACKGROUND:
Like
all programming problems, learning a new technology is not an
exercise in reading but rather an exercise in thinking and
typing. This lab is designed to give you more hands-on
experience in some fundamental skills in cryptography. You
will generally find the References section below helpful in
addition to the required and recommended reading.
WHAT YOU NEED TO DO:
STEP 1:
Identify
the SHA-256 hash functions in your language's libraries.
You can find examples of SHA-256 functions in multiple
languages in my pub directory located here:
~mark/pub/56600/src.lecture/lecture.2/sha256/examples/*.
You are
to take the following two sets of data:
{
0000000036dc2ce23cdd934eff4bae120155de8b8712de8489c8870b06e334ff,
c0ba5ba1c5ba02fb02c4ea93ec16cae7ea1994b5b3f9ef0e293b841081eb3003,
the current calculated timestamp in
seconds in Unix Epoch format (i.e., numeric)
1.00000,
5251252
}
{
00000000d84724559f1691d916b2ed63a44884d495d155197647ce7667116b16,
69a14e6b050d10d6621faee3dac6682809feb0ffa76320b33c5c09f1059f06c7,
the current calculated timestamp in
seconds in Unix Epoch format (i.e., numeric)
1.00000,
124867778
}
You are
to produce two separate bytestreams of these two sets of
data. Then you are to write a program in your programming language of choice
(Java/haskell/C#/python/Go/ruby, etc.) that will
serialize the data into a bytestream for each set of data (using
whatever capabilities your language has for doing so), and do
the following:
1.
Double SHA-256 hash the first set of bytestream info (will look
something like sha256(sha256(set1_bytestream))) and store the
resulting digest in a variable called sha_left.
Print out sha_left.
2. Double SHA-256 the second set of bytestream info (will
look something like sha256(sha256(set2_bytestream))) and store
the resulting digest in a variable called sha_right. Print
out sha_right.
3. Single SHA-256 hash your two hashed results,
sha_left and sha_right, which will look something like
sha256(sha_left + sha_right). [Note you are concatenating
two 32-byte hex strings of numbers here, not mathematically
adding them.]. Store the resulting digest in a variable
called root. Print out root. [Note that when
testing, your output will change between runs because the
timestamp will likely be different...]
Save
your program, test it, and submit it as per the instructions
below.
References:
ASCII
Table: https://www.asciitable.com
Letter Frequency Help: https://sandilands.info/sgordon/classical-ciphers-frequency-analysis-examples
Steven Gordon's pdf : https://sandilands.info/sgordon/doc/classical-ciphers-frequency-analysis-examples
Submitting:
Upload your Lab 3 code and
any supporting materials to your PhoenixForge repo, including your response
file. Please include a README text file that contains any
instructions for the TAs to assist with grading, and design
notes are often the most useful thing you can provide. Please
provide info on how to compile and run your code.