
COMP 524: Programming Language Concepts

Spring, 2008
Jeff Terrell
jsterrel AT cs.unc.edu
(919) 962-1791 (office: Sitterson 138)

COMP 524 Exercise 3 (Scheme)

General Instructions

First, review the assignment submission policy in the syllabus. Note that no collaboration is allowed.

This assignment is due at 11:59pm on Tuesday, April 1 (no foolin'). Submit your assignment to me via email. Put all of your functions in a single file called 'exercise3.scm'; that file should contain everything that you turn in. When I am finished grading your assignment, I will email you your grade and any comments that I have (assuming I have your permission for this). There are a total of 100 points and 5 bonus points.

Remember: start early! I guarantee my availability during office hours, but not at 11pm the night it is due.

When you turn in exercise3.scm, there should be no statements at the top level that produce output or require input. Merely provide the required functions and anything the functions need--do not include any code related to your testing or debugging. When I import your functions into another script, I want the import to be silently successful. Thank you.


Background

Conventional interfaces are a powerful way to program. They have two components: a standard data representation and a set of simple operations. Each operation takes data in the standard form and returns data in the standard form, so operations can be combined to form pipelines. A well-known conventional interface is the Unix command line, which is what this exercise is modeled after.

The standard data representation in Scheme is, naturally, a list. So, we will define a field to be an element of a list, a record to be a list of fields, and a glob to be a list of records. Globs are the standard data representation passed between operations. Fields are either numbers or strings.
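
To make these definitions concrete, here is a minimal sketch that spells them out as predicates. The names field?, record?, glob?, all?, and sample_glob are hypothetical helpers for illustration only; they are not part of the assignment and you are not required to write them.

```scheme
; Hypothetical helpers (not required by the assignment).

; A field is either a number or a string.
(define (field? x) (or (number? x) (string? x)))

; #t if pred holds for every element of lst (vacuously #t for the empty list).
(define (all? pred lst)
  (or (null? lst)
      (and (pred (car lst)) (all? pred (cdr lst)))))

; A record is a list of fields; a glob is a list of records.
(define (record? x) (and (list? x) (all? field? x)))
(define (glob? x) (and (list? x) (all? record? x)))

(define sample_glob (list
  (list 17 "foo" 0)
  (list 8 "bar" 1)))

(glob? sample_glob)          ; #t
(glob? (list (list 1 'a)))   ; #f, because the symbol a is not a field
```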

First, we will create the conventional interface. Then, we will use it.


Part A: Creating the Conventional Interface

A.1 - (sum glob)

Details
Given a glob of records, each of which has a single number field, return the sum of the numbers.
Example
(define a_glob (list
  (list 1)
  (list 2)
  (list 3)
))
(sum a_glob)
6
Points
5 points
Par
2 minutes

A.2 - (filter test glob)

Details
Given a glob and a test, filter the glob so that only records that pass the test remain. Note: the built-in list-ref function might be useful here.
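
As a quick reminder of how list-ref behaves, the sketch below pulls individual fields out of one record and wraps a check in a one-argument procedure, which is the shape a test takes. The names sample_rec and first-field-odd? are made up for this illustration.

```scheme
; list-ref uses 0-based indexes.
(define sample_rec (list 17 "foo" 0))

(list-ref sample_rec 0)   ; 17
(list-ref sample_rec 1)   ; "foo"

; A test is a procedure that takes one record and returns #t or #f.
(define (first-field-odd? rec) (odd? (list-ref rec 0)))

(first-field-odd? sample_rec)   ; #t
```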
Example
(define b_glob (list
  (list 17 "foo" 0)
  (list 8 "bar" 1)
  (list 1 "yadda" 1)
))

(filter (lambda (rec) (odd? (list-ref rec 0)))
        b_glob)
((17 "foo" 0) (1 "yadda" 1))

(filter (lambda (rec) (= (list-ref rec 2) 1))
        b_glob)
((8 "bar" 1) (1 "yadda" 1))
Points
5 points
Par
6 minutes

A.3 - (cut fields glob)

Details
Only keep certain fields in a glob, discarding others. fields is a list of the fields to keep, with indexes starting at 0. Note that the length of each record in the output glob is the same as the length of the fields parameter.
Example
; let b_glob be defined as above

(cut (list 0 2) b_glob)
(
  (17 0)
  (8 1)
  (1 1)
)


(cut (list 1) b_glob)
(
  ("foo")
  ("bar")
  ("yadda")
)
Points
10 points
Par
5 minutes

A.4 - (record= rec1 rec2)

Details
Determine whether two records are equal. They are equal if and only if every field in one record is equal to the corresponding field in the other.
Hint
Because the = operator does not accept strings, you'll want to use the string? and string=? functions, as documented here.
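
One way to read this hint is sketched below for a single pair of fields. The name field= is hypothetical, and treating a number and a string as unequal is an assumption of this sketch, since the assignment does not say what to do with mixed types.

```scheme
; Hypothetical helper: compare two fields, each a number or a string.
(define (field= a b)
  (cond ((and (number? a) (number? b)) (= a b))
        ((and (string? a) (string? b)) (string=? a b))
        (else #f)))   ; assumption: mixed types are never equal

(field= 3 3)           ; #t
(field= "foo" "foo")   ; #t
(field= "foo" "bar")   ; #f
(field= 3 "3")         ; #f
```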
Example
(record= (list 1 2 3) (list 4 5 6))
#f   ; (that is, false)

(record= (list 1 2 3) (list 1 2 4))
#f

(record= (list 1 "foo" 3) (list 1 "foo" 4))
#f

(record= (list 1 2 3) (list 1 2 3))
#t
Points
5 points
Par
5 minutes

A.5 - (uniq glob)

Details
Eliminate successive duplicate records, using the record= function. In other words, if the next record is equal to the current record, only save one copy of the record. Note: do not keep track of all records. If a record is a duplicate of another record, but the duplicate doesn't appear immediately after the original, then do not delete it. See the example.
Example
(define c_glob (list
  (list 1 2 3)
  (list 1 2 3)
  (list 1 2 3)
  (list 1 2 4)
  (list 1 2 4)
  (list 1 2 3)
))

(uniq c_glob)
(
  (1 2 3)
  (1 2 4)
  (1 2 3)
)
Points
10 points
Par
5 minutes

A.6 - (uniq-with-counts glob)

Details
This is like the uniq function above, except that a count field is appended to each record, giving the number of successive copies of that record that were seen (including the one that is kept).
Hint
You might find the built-in append function useful here.
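
Keep in mind that append joins lists, so a single value has to be wrapped in a list before it can be appended. A small sketch, with made-up names sample_rec2 and n:

```scheme
(define sample_rec2 (list 1 2 3))
(define n 3)

; Wrap the number in a list so that append can join the two lists.
(append sample_rec2 (list n))   ; (1 2 3 3)
```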
Example
; c_glob is defined as above

(uniq-with-counts c_glob)
(
  (1 2 3 3)
  (1 2 4 2)
  (1 2 3 1)
)
Points
10 points
Par
9 minutes

A.7 - (record< rec1 rec2)

Details

Determine whether one record is less than another. The records are compared element-by-element in the same way as a string is compared character-by-character.

In other words, A < B if:

  • A[0] < B[0], or
  • A[0] == B[0] and A[1] < B[1], or
  • A[0] == B[0] and A[1] == B[1] and A[2] < B[2], or
  • and so on...
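
The string analogy can be checked directly with the built-in string<? function, which applies the same rule character by character. This sketch only illustrates the ordering rule; it is not a record comparison.

```scheme
(string<? "abc" "abd")   ; #t, the first difference is c < d
(string<? "abd" "abc")   ; #f
(string<? "abc" "abc")   ; #f, equal strings are not less than each other
(string<? "bar" "foo")   ; #t, decided at the first character
```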
Example
(record< (list 1 2 3) (list 4 5 6))
#t

(record< (list 1 2 3) (list 1 2 2))
#f

(record< (list 1 2 2) (list 1 2 3))
#t

(record< (list 1 "foo" 2) (list 1 "bar" 3))
#f

(record< (list 1 2 3) (list 1 2 3))
#f
Points
5 points
Par
6 minutes

A.8 - (sort glob)

Details
Using the record< function, sort a glob so that the records are in order, and return the sorted glob. Hint: use bubble sort.
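
If bubble sort is unfamiliar, the sketch below shows a single pass over a plain list of numbers. The name bubble-pass is hypothetical, and this is only one possible way to write a pass; repeating passes until nothing changes, and comparing records instead of numbers, is left to you.

```scheme
; One bubble pass: walk the list and swap adjacent out-of-order pairs,
; so that the largest element ends up last.
(define (bubble-pass lst)
  (cond ((or (null? lst) (null? (cdr lst))) lst)
        ((< (cadr lst) (car lst))
         ; out of order: emit the smaller one and carry the larger one forward
         (cons (cadr lst) (bubble-pass (cons (car lst) (cddr lst)))))
        (else
         (cons (car lst) (bubble-pass (cdr lst))))))

(bubble-pass (list 3 1 2))   ; (1 2 3)
(bubble-pass (list 3 2 1))   ; (2 1 3), one pass is not always enough
```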
Example
(define e_glob (list
  (list 1 "foo" 0)
  (list 8 "bar" 1)
  (list 1 "yadda" 1)
))

(sort e_glob)
(
  (1 "foo" 0)
  (1 "yadda" 1)
  (8 "bar" 1)
)
Points
10 points
Par
20 minutes

Part B: Using the Conventional Interface

Now, use the conventional interface to answer the following questions. Do not use any functions to answer these questions other than the ones you defined above, plus length.

Each of these functions will be named B.1 through B.9. Each function will accept a glob as an argument and will return the answer. Depending on the question, the answer might be a single number, or it might be a glob. Each correct answer is worth 5 points unless otherwise noted.
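
To show the expected shape of these functions, here is a sketch of a made-up question, B.0, which is not part of the assignment: how many packets are there in total? It takes a glob, uses only length, and returns a single number.

```scheme
; Hypothetical example question (not graded): total number of records in the glob.
(define (B.0 glob)
  (length glob))

(B.0 (list (list 1) (list 2)))   ; 2
```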

The data set you will use to answer the following questions is available here. It contains a list of packet arrival events. The list is in standard Scheme syntax, and calling the load function on the file will import the list into the variable packets. There are 6 fields:

  1. timestamp_second - the second part of the time at which the packet arrived
  2. timestamp_microsecond - the microsecond part of the time
  3. local_IP - the "local" IP address
  4. direction - the direction of the packet: either < or >
  5. remote_IP - the "remote" IP address
  6. packet_length - the size of the packet, in bytes
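
Because list-ref and cut count from 0, the six fields above sit at indexes 0 through 5. The sketch below names those positions and reads them from one record. The idx-... names and every value in sample_packet are invented for illustration, and representing IP addresses and the direction as strings is an assumption; check the actual data file for the real representation.

```scheme
; Hypothetical index names for the six fields, in the order listed above.
(define idx-timestamp-second      0)
(define idx-timestamp-microsecond 1)
(define idx-local-ip              2)
(define idx-direction             3)
(define idx-remote-ip             4)
(define idx-packet-length         5)

; A made-up record; these values do not come from the data set.
(define sample_packet (list 1206000000 123456 "10.0.0.1" ">" "192.0.2.7" 46))

(list-ref sample_packet idx-direction)       ; ">"
(list-ref sample_packet idx-packet-length)   ; 46
```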

B.1 - # local IP's

Description
How many unique local_IP values are there?
Hint
uniq doesn't remove duplicates unless they are next to each other. So, you need to figure out a way to arrange the glob so that duplicates are next to each other. Thankfully, you already created a function to do this...

B.2 - # remote IP's

Description
How many unique remote_IP values are there?

B.3 - # flows

Background
A flow is defined as a (source_address, destination_address) pair. In other words, a flow is a combination of (local_IP, direction, remote_IP). (Note that a connection between two hosts consists of two flows: one in each direction.)
Description
How many unique flows are there?

B.4 - packets per second

Description
How many packets are there per second? Your answer should be a glob with two fields: the second and the count.

B.5 - packets per flow

Description
How many packets are there per flow? Your answer should be a glob with four fields: the local_IP, direction, remote_IP, and count.

B.6 - # outgoing packets

Description
How many packets were seen going from a local_IP to a remote_IP (i.e. with direction == ">")?

B.7 - # outgoing bytes

Description
How many bytes were seen going from a local_IP to a remote_IP?

B.8 - 46-byte packets at odd seconds

Description
How many 46-byte packets were seen during odd-numbered seconds?

B.9 - histogram of packet sizes (bonus)

Description
How many packets of size X were seen, for each value of X? Your answer should be a glob with two fields: the value of X and the count. This question is worth 5 bonus points.
exercise3.php: Last Modified: 04/15/08@22:27:10