Introduction:
There are many different ways to write a program to perform a given task. For
example, in Java there are at least three fundamentally different ways to write a
program. It is possible to place all of the code in the main method
and not use any method calls. For example, the following program sums up all of the
numbers from 1 to 1000:
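The original listing is not reproduced here. A minimal sketch of such a program, assuming a class named AllInMain (the name used later in this handout) and a simple loop, might look like this:

```java
// Hypothetical reconstruction: the whole program lives in main,
// with no method calls inside the loop.
class AllInMain {
    public static void main(String[] args) {
        int sum = 0;
        for (int i = 1; i <= 1000; i++) {
            sum = sum + i;   // the addition is written in-line
        }
        System.out.println(sum);
    }
}
```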
A second approach is procedural programming: break the
program into subroutines implemented as static methods. Written
procedurally, the program for summing the numbers from 1 to 1000
might appear as:
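The original listing is not reproduced here. A minimal sketch, assuming a class named StaticMethod (the name used later in this handout) and a hypothetical helper method add, might look like this:

```java
// Hypothetical reconstruction: the same summation, but the addition
// is moved into a static method that is called once per iteration.
class StaticMethod {
    // Hypothetical helper: returns the running total plus one more value.
    static int add(int total, int value) {
        return total + value;
    }

    public static void main(String[] args) {
        int sum = 0;
        for (int i = 1; i <= 1000; i++) {
            sum = add(sum, i);   // one static method call per iteration
        }
        System.out.println(sum);
    }
}
```

Apart from the call to add, the loop does the same work as the in-line version, which is what makes the two programs comparable.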
The third approach would be to use an object oriented approach, identifying the objects and instance methods necessary to complete the program.
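As one possible illustration only, the summation could be written with an object that holds the running total. The class names Accumulator and InstanceMethod and the methods add and getTotal are assumptions made for this sketch, not part of the original handout:

```java
// Hypothetical object that owns the running total.
class Accumulator {
    private int total = 0;

    // Instance method: adds one value to this object's total.
    void add(int value) {
        total = total + value;
    }

    int getTotal() {
        return total;
    }
}

class InstanceMethod {
    public static void main(String[] args) {
        // The object is created once, outside the loop.
        Accumulator acc = new Accumulator();
        for (int i = 1; i <= 1000; i++) {
            acc.add(i);   // one instance method call per iteration
        }
        System.out.println(acc.getTotal());
    }
}
```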
These increasingly complex programming paradigms (unstructured, procedural and
object oriented) bring with them the ability to deal effectively with larger and
larger programs. However, along with this increased effectiveness for large
programs comes increased overhead that can decrease performance. In fact, you have
probably been told that method calls incur additional overhead that makes them
slower than including the same code in-line. For example, the statement
x=y*y; will execute more quickly than x=Math.pow(y,2.0);
because the method call overhead can be eliminated. Similarly, the
AllInMain program above will execute more quickly than the
StaticMethod program.
In theory, we should be able to use the above programs to measure the method
calling overhead associated with static methods in Java. First, note
that the two programs perform, as nearly as possible, the same operations, with the
exception of the static method call in the StaticMethod
class. Thus, if we run both programs and find the time spent executing the
operations in each program, the difference in these times will be the time spent
executing the method calls in the StaticMethod program. In this
project you will use this approach to measure and compare the overhead involved in
calling static and instance methods in Java.
Getting Started:
To get started you will need a tool that allows you to measure how much time each part of a program takes to execute. You will then be able to use this tool, and some calculations, to determine how much overhead is incurred by calling methods in Java. Fortunately, the Java Virtual Machine (JVM) provides a profiler, which is just such a tool. The profiler in the JVM will keep track of the total time it takes a program to execute and what percentage of that total time is spent in each method in the program.
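As a rough illustration of the kind of calculation involved, the sketch below subtracts the time of an in-line program from the time of an otherwise identical program that makes method calls, then divides by the number of calls. The class name OverheadEstimate, the method name perCallOverheadNs, and all of the numbers are made up for this example and are not measurements:

```java
class OverheadEstimate {
    // Per-call overhead in nanoseconds: the extra time taken by the
    // version with method calls, spread over the number of calls made.
    static double perCallOverheadNs(double inlineMs, double withCallsMs, long calls) {
        double extraMs = withCallsMs - inlineMs;
        return extraMs * 1_000_000.0 / calls;   // 1 ms = 1,000,000 ns
    }

    public static void main(String[] args) {
        // Hypothetical inputs, chosen only to show the arithmetic.
        double inlineMs = 12000.0;      // time for the in-line version
        double withCallsMs = 12500.0;   // time for the version with calls
        long calls = 100_000_000L;      // 100,000 repetitions x 1000 calls
        System.out.println(perCallOverheadNs(inlineMs, withCallsMs, calls) + " ns per call");
    }
}
```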
As an example of how to use the JVM profiler we'll use the following variant of
the AllInMain program from above:
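The original listing is not reproduced here. A minimal sketch of such a variant, assuming the class name AllInMain2 and an outer loop variable j, might look like the following. The profiling command in the comment is also an assumption: it uses the HPROF agent that shipped with JDK 8 and earlier, which was removed in JDK 9.

```java
// Hypothetical reconstruction of AllInMain2.
// Assumed way to compile and profile it on JDK 8 or earlier:
//   javac AllInMain2.java
//   java -agentlib:hprof=cpu=times AllInMain2
class AllInMain2 {
    public static void main(String[] args) {
        int sum = 0;
        // Repeat the whole summation many times so that main
        // dominates the total run time.
        for (int j = 0; j < 100000; j++) {
            sum = 0;
            for (int i = 1; i <= 1000; i++) {
                sum = sum + i;
            }
        }
        System.out.println(sum);
    }
}
```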
The AllInMain2 program still computes the sum of all of the numbers
from 1 to 1000. However, it repeats this operation 100,000 times. The reason that
it is repeated 100,000 times is so that the time used by the main
method dominates the total time required for the program to execute. If the sum is
only performed once, other operations such as initializing the JVM and loading the
class into the JVM take more time than the actual program.
To use the JVM profiler, you compile the program normally and use a command line
argument to the JVM to ask it to perform profiling. For example, to profile the
MainTest program you would use the command:
This command will run the MainTest2 program and record the
profiling data in a file named java.hprof.txt. Note that you may see
several lines of output that say "HPROF ERROR". These are due to glitches in the
profiler; they will not affect your results and can safely be ignored. The
java.hprof.txt file contains a lot of information. The information
that you will be interested in for this project appears near the end of the file
and will look similar to the following:
In the first line above, total = 12400 indicates that the
MainTest2 program took a total of 12400 milliseconds (i.e. 12.400
seconds) to execute. The following lines provide profiling information about each
method called (directly or indirectly) by the MainTest program. The
rank column orders the methods by the time they used, with rank 1 going to the
method that used the most time. In the table above, the
main method in the MainTest class has rank 1, indicating
that it used the most time. The self column indicates the percentage
of the total time that was used by each method. For example, the main
method used 98.31% of the total time, while the charAt method in the
java.lang.String class used 0.08% of the total time. Notice that the
MainTest2 program does not explicitly call the
String.charAt or HashMap.hash methods; these
methods are instead called by the JVM during initialization and when loading the program.
The accum column maintains a running total of the percentage of total
time used by the current method and all methods ranked above it. For example, the last line in the table
above indicates that the methods with rank 1 through 7 used 98.79% of the total
time.
Collecting the Data:
You should begin by collecting data on the amount of time the
MainTest program spends in the main method. However,
before collecting your data you should adjust the number of iterations in the
for (int j=0...) loop so that the main method uses at
least 95% of the total time. Also, be sure to collect and average the data from at
least 20 trials.
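One way to organize this bookkeeping is sketched below: each trial's time in main is taken as the total time multiplied by the self percentage, and the per-trial times are then averaged. The class name TrialAverage, the method names, and the trial values are made up for illustration, and only three trials are shown to keep the example short:

```java
class TrialAverage {
    // Time spent in one method during one trial: the total run time
    // scaled by that method's "self" percentage.
    static double timeInMethodMs(double totalMs, double selfPercent) {
        return totalMs * selfPercent / 100.0;
    }

    // Arithmetic mean of the per-trial times.
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    public static void main(String[] args) {
        // Hypothetical values for three trials; a real experiment
        // would record at least 20.
        double[] totalMs = {12400.0, 12350.0, 12480.0};
        double[] selfPercent = {98.31, 98.27, 98.40};

        double[] mainMs = new double[totalMs.length];
        for (int t = 0; t < totalMs.length; t++) {
            mainMs[t] = timeInMethodMs(totalMs[t], selfPercent[t]);
        }
        System.out.printf("mean time in main: %.2f ms%n", mean(mainMs));
    }
}
```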
After collecting data for MainTest you will need to conduct
additional experiments to collect data that will allow you to calculate the
overhead of a static method call and of an instance method call. The
StaticMethod example in the introduction should give you some good
ideas on how to get started. As you begin to design these additional experiments
you will need to be careful about several issues. First, ensure that you are measuring
only the overhead of the method calls. This will be particularly important when you
develop your object oriented program. Second, each method call takes a very small
amount of time. Thus, you will need to measure the time for a large number of
method calls and divide to find the time for a single method call.
The Write ups:
There will be three written documents associated with this project. The first will be a project proposal written collectively by each group. The second and third documents will be a rough draft and final draft of a scientific paper describing the project. The rough and final drafts must be written individually.
Project Proposal
Your goal in the project proposal is to communicate exactly what you plan to do and how you plan to do it. Your proposal should begin by very briefly stating the goal of the project and then outlining how the goal will be accomplished. In particular, you must describe the programs you plan to use and include their source code. You must also discuss precisely what data you will collect and what calculations you will perform to determine the method calling overheads. Note that you are not expected to have collected any data at this point, only to have thought very carefully about the data that you will need and how you are going to collect it.
Your proposal should be about two pages in length. You should write your proposal as if the target audience were another member of this class. Thus, the language and terminology that you use should be clear to your peers.
Rough and Final Drafts
Each individual will turn in both a rough draft and a final draft. The rough and final drafts should be 5-7 pages. These drafts will take the form of a scientific paper with the following sections:
Introduction:
In the introduction you will state the question that you have investigated and provide some motivation for why it is an interesting question. You will also want to provide a very general outline of how you will answer the question. However, be careful to avoid being overly specific in the introduction; the details of the experiment come later.
Background:
Your background section should provide an uninitiated reader with enough information to understand the question that you are investigating and what makes it interesting. While the introduction states the question and why it is interesting, this section, as the name suggests, provides more background and a more detailed discussion.
In this paper, your background section will need to define and discuss unstructured, procedural and object oriented programming and the applications and trade-offs of each of them. You may find websites and introductory computer science textbooks, available in the library, helpful in improving your understanding of these programming paradigms. Be sure to reference any sources that you use. All references and citations should use the APA style.
Methods:
Your goal in the methods section is to describe in detail the experiments that you used to measure the method call overhead. You will need to discuss the programs that you used, the data that you collected, how you collected that data and the calculations that you performed. Based on what you say in this section, it should be possible for someone who reads your paper to precisely repeat your experiments. Note that to precisely repeat your experiment the reader will need to know which hardware, operating system, compiler, etc. you used.
Results:
In this section you will present the results of the experiments that you described in the previous section. This will include any tables, graphs or charts that are necessary to clearly communicate your results. Note that all tables, charts and graphs must have a label and a caption. In addition you will need to explain the meaning of the data in any tables and the significance of any charts or graphs. You should also include a brief discussion of each of the results in this section.
Conclusions:
In your conclusions you will discuss the answer to the question you set out to investigate as well as any implications of the answer that you found.