Skip to main content

Homework 6: Heaps & Polling

Due: 2024-11-17 23:59:00EDT

Overview

For this assignment, you’ll continue your previous investigation of Democratic primary polling data by determining which candidate is currently in the lead.

1. Implement a Max Heap

First, download the PriorityQueue interface: wget https://raw.githubusercontent.com/BMC-CS-151/BMC-CS-151.github.io/main/hws/hw06/PriorityQueue.java

You should start by implementing the given PriorityQueue interface with a heap so that generic objects that implement the compareTo function from the Comparable interface can be inserted into your priority queue.

Requirements

For this assignment (as in the lab), you can assume a max capacity of 100 and do not need to expand your underlying array.

  1. Start with implementing an array-based binary tree ArrayBinaryTree that implements the given interface BinaryTree from the last assignment. This is your first experience writing two different implementations of the same data structure (that conform to the same interface). Most of what you need to do below should have been practiced in Lab08, so make sure your lab is working first.

    A. You are not required to use only recursive implementation techniques this time around (because recursive is not the best choice for array-based). You only need to implement a binary tree, not a binary search tree. However, your binary tree should be complete.

    B. Remember that an arbitrary binary tree is not guaranteed to be complete (for example, the user might execute remove on any element any time) and you need to handle the case where positions may become null through out your tree.

    C. The three traversal methods have the same specification as the last homework.

    D. Override toString the same way as the last assignment, i.e. For a tree rooted at b with a left node a and a right node c, return a string of the following format:

     Tree:
     Pre: b a c
     In: a b c
     Post: a c b
    

    Where the first traversal is a pre-order traversal, the second is in-order, and the last is post-order.

    E. Add a method toStringBreadthFirst which returns a string containing the elements of the tree traversed in breadth-first traversal order. Note that this method is not in the interface because it is not a common traversal order for all binary trees, but is useful for displaying/debugging array-based ones.

  2. Now implement the PriorityQueue interface as a ArrayHeap which extends ArrayBinaryTree. Note that it is recommended to add protected or private helper methods in both ArrayBinaryTree and ArrayHeap as you design deems necessary. On the other hand, your implementation should be properly encapsulated, i.e. no implementation details should be made visible outside of the ArrayHeap or ArrayBinaryTree classes. In other words, any method that leaks implementation details should not be public.

    A. Your implementation of insert should use the compareTo method of the given element to determine which order to arrange the priority queue. Insertion of elements that are already in the tree should update the current element in the priority queue while making any updates necessary to guarantee the heap property. When you put your polling data into the tree this will be equivalent to updating the poll numbers for a candidate. Since this heap will be used to store polling data, you should be implementing this as a maximum heap, so that we will be able to easily retrieve the current top candidate. Note that there is very little actual difference between a max-heap and the min-heap you saw in class and implemented in lab. The only change should occur with compareTo.

    B. Your implementation of remove should ensure the heap property is maintained. If the given element can not be found in the heap, this method should do nothing and return false.

    C. Override toString to return a String representation of the heap in breadth-first order, level by level. For example, for the min-heap you constructed in Lab8 via inserting 9 down to 0 into the heap, its string represention should look like this:

     0
     1 4
     3 2 8 5
     9 6 7
    

    If you inserted 9 down to 0 into a max-heap instead, its string representation should look like this:

     9
     8 7
     6 5 4 3
     2 1 0
    

    Hint: You may want to use a Stack.

2. Top Candidates

While a heap is usually required only to return the maximum (or minimum) element, since this heap will be used to store polling data, it may be interesting to us to retrieve the top few candidates.

Requirements:

  1. Add a method ArrayList<E> peekTopN(int n) that returns the top elements of the heap in order. If the heap has less elements than n return all elements in the heap in descending order. The heap should not be any different after the method was called than it was before the method was called, i.e., this is similar to peek in that it does not remove the top element. You should not implement this method by removing and then reinserting each element, as this has the potential to modify the heap.

  2. Describe your design of the peekTopN method in your README file and give a big-O analysis.

3. Command Line Input

As in the previous assignment, you will take filenames that store polling data as arguments to your main method. Your resulting heap should contain the polling data for each candidate from the most recent date for which there is data from the files given on the command line. The resulting heap should be ordered so that the candidate with the highest percentage of voters in the most recent poll is at the top of the heap.

Additionally, you will add an option for the users to provide a flag that will run your peekTopN method to determine and print out the top N candidates. The resulting arguments you should handle will look like this: -n 5 dempres_20190210_1.csv dempres_20190210_2.csv. In this case, the top 5 candidates would be displayed.

Finally, you will add another optional argument to remove some candidates from consideration. The full set of arguments you should handle will look like this: -n 5 -r Biden Bloomberg dempres_20190210_1.csv dempres_20190210_2.csv In the above case you would print out the top 5 candidates who are not Biden or Bloomberg based on the polling data in the given files.

You will need a PollingData class. Start by copying over your implementation from the previous assignment. Note that this heap is sorted by polling percentage rather than last name. Modify your PollingData class to reflect this. Keep in mind that insert should still replace a candidate element with the same last name.

Requirements:

  1. Take filename input from the command line into the main method of your DriverHW06.java. You may be given multiple filenames. Also note that just like in HW02, you should expect that the input filenames may have additional (/- separated) path in front of the filenames.

  2. Process an optional flag to print out the top N candidates. These should be printed out like this (where the example given is for N = 2):
    Top 2 Candidates:
    Joseph R. Biden Jr.:29.0
    Kamala D. Harris:15.0
    

    The above printout should happen once after all polling data has been inserted.

  3. Process the optional −r flag to remove candidates. This should be done so that the top N candidates do not include any removed candidates if they are optionally printed out. However you should only perform these remove operations once. One suggested order for handing these flags is to 1) insert all the polling data (printing out the heap after each file is inserted), 2) remove the candidates, and 3) show the top N candidates. In other words, it is expected that when you print out the heap it will include the candidates that will later be removed.

  4. You many assume that the list of filenames is always last, i.e. the first non-flag argument you encounter is assumed to be the beginning of the list of file names. Make sure you error-check thoroughly, both the arguments to the flags and the flags themselves. Your program should behave rationally no matter how unreasonable the input or the value of flags.

Remember the order of flags should not matter, that is, -n 5 -r Biden Bloomberg dempres_20190210_1.csv dempres_20190210_2.csv and -r Biden Bloomberg -n 5 dempres_20190210_1.csv dempres_20190210_2.csv will generate the same output.

  1. As in the previous assignment, you should print out the heap after each polling file is inserted.

Sample Input:

answer,candidate_name,pct
Biden,Joseph R. Biden Jr.,25
Sanders,Bernard Sanders,16
Harris,Kamala D. Harris,8
Warren,Elizabeth Warren,8
O'Rourke,Beto O'Rourke,7
Booker,Cory A. Booker,5

Output:

Joseph R. Biden Jr.:25.0
Bernard Sanders:16.0 Kamala D. Harris:8.0
Elizabeth Warren:8.0 Beto O'Rourke:7.0 Cory A. Booker:5.0

Submitting

Submit the following files to the assignment called HW06 on Gradescope

  1. README: The usual plain text file README
    • Your name:
    • Discussion: Design of peekTopN as explained above
    • How much time you spent on the assignment
    • What did you learn?
    • (Optional): Any feedback or comments
  2. Source files:
    • ArrayHeap.java
    • ArrayBinaryTree.java
    • PollingData.java
    • DriverHW06.java

Copying the files

You will likely want to submit the files from your own laptop/computer. To copy the files from goldengate to your own machine, open up a terminal on your own computer. If you have a mac or linux, you can use the Terminal application, if you have a windows you can use the Powershell app.

Run the following command:

$ scp <username>@goldengate.cs.brynmawr.edu/home/<username>/cs151/homeworks/hw00/*

Make sure to replace <username> with your username. scp is a command to secure-copy files from one server to another. In this case, from goldengate to your own machine.