This FAQ is a compilation of the most frequently asked questions. It is NOT a tutorial. You should still use the tutorial pages to find explanations of features and concepts. You may want to use your browser's Edit => Find to search for help, or use the index below. The index has been divided into these sections to make it easier to find help:
| Tunnels and Connectivity | Problems with the RLI tunnel, connectivity, and ssh. |
| The Remote Laboratory Interface | Problems with using the basic RLI features but excluding filters, queues, and plugins. |
| Filters, Queues and Bandwidth | Problems with filters and queues. |
| Router Plugins | Problems with using and writing router plugins. |
| The SYN Demo | Problems running the SYN demonstration in Tutorial => Examples => The SYN Demo |
| Unix Commands | Problems with Unix commands such as source, ping, netstat, and iperf. |
Selecting a link from this table will take you to the Questions section. If you find a potentially helpful question, select the Q-label and that link will take you to the question/answer in the Questions and Answers section.
Warning: Permanently added the RSA host key for IP address '10.0.1.3' to the list of known hosts.
What does it mean and what's wrong?
onlusr> source /users/onl/.topology.csh
I get an unexpected end of file.
$n1p2> ssh -g -L 8080:n1p3:80 n1p3
I get "Connect to host n1p3 port 22: Network is unreachable." What could be the problem?
onlusr> source /users/onl/.topology.csh
I get an unexpected end of file.
When I enter http://localhost:8080/~YourUserName/syndemo in the browser, I get "Unable to connect" and "Firefox can't establish a connection to the server at localhost:8080."
When I ran (cd .www-docs/syndemo/Images; touchme &), the script did not run. The following error message was shown in the command-line window:
-bash: cd: .www-docs/syndemo/Images: No such file or directory
-bash: touchme: command not found
[1]+ Exit 127 touchme
When I ran sudo /usr/local/bin/sec/synster, the following message appeared in the window:
"Sorry, user mndd is not allowed to execute '/usr/local/bin/sec/synster' as root on onl41.arl.wustl.edu."
where onl41.arl.wustl.edu is the external address of n1p1a (where the attacker resides).
Warning: Permanently added the RSA host key for IP address '10.0.1.3' to the list of known hosts.
What does it mean and what's wrong?
A1: It looks like you did not build your RLI tunnel. See the "Getting Started" link in the sidebar of the ONL web page. The least troublesome way to build the RLI tunnel is to run the ssh command from the command line:
ssh -L 7070:onlsrv:7070 onl.arl.wustl.edu
If you are using a graphical tool like PuTTY or SSH Client, you will have to follow precisely the steps given in the Getting Started sidebar. The precise steps for building the RLI tunnel are given at the RLI SSH Tunneling link on that page. If you are taking a course that is using ONL, someone should be assigned to help you with this if you have problems.
Warning: Permanently added the RSA host key for IP address '10.0.1.3' to the list of known hosts.
What does it mean and what's wrong?
A2: The short answer is that there is nothing wrong.
10.0.1.3 is the IP address of the eth0 interface of the host onlusr; i.e., the ONL user host. You can see this by entering:
onlusr> /sbin/ifconfig
while logged into onlusr; note the inet addr field in the eth0 entry. You didn't say so, but my guess is that you got this message when you tried to build an SSH tunnel from one ONL host to another. More specifically, you must have entered the ssh command FROM onlusr to some other onl host. Whenever you SSH to a remote host X from host Y, the IP address of host Y (the FROM host) is looked up in the file ~/.ssh/known_hosts (a plaintext file with RSA keys) at the remote host X. If it is there, then you are connected to that host. If not, then SSH will add the hostname to the file after authentication. In your case, I noticed that your ~mndd/.ssh/known_hosts contains an entry for 10.0.1.3 as its first entry ... which makes sense. This is why ...
Your onl home directory is NFS mounted on every onl host, which also means that the file ~mndd/.ssh/known_hosts is accessible on every onl host. Suppose that you are on onlusr, and you enter something like:
onlusr> ssh onl31
All of your onl hosts (given to you through File => Commit) are set up to accept the ssh connection without asking for a password. But the ssh server (daemon) running on onl31 will still do some authentication. One thing it does is look at the file ~mndd/.ssh/known_hosts on onl31 to see if the IP address of onlusr (10.0.1.3) is a host that you have allowed to log in to onl31 before. The first time you do this, there is nothing in the known_hosts file. Since you are allowed to log in to your onl hosts from other onl hosts, ssh adds the host to the known_hosts file. Enter the command "man ssh" and scroll down to the section "Server authentication" for more details.
A3:
+ You can not log into an ONL host unless you have an ONL account. This means that you must have either registered for an account through the ONL Web page or you received a predefined login name as part of a course/tutorial (and an email).
+ You can only ssh into onl.arl.wustl.edu from outside of the testbed. Once the ssh succeeds, you will end up on the host acting as the user host (currently onlusr).
+ You can only ssh into other hosts after they have been committed to you; i.e., wait for the experiment commit to finish first.
A4: All requests from the RLI to the testbed go through the ONL Proxy Daemon. This type of error usually means that the connection between the RLI and that Proxy Daemon either was lost or never established. Here are some possibilities:
- The Daemon died.
Possible, but unlikely, since we have been using the system intensely in the last 1.5 weeks.
>> If so, just try again. I used the system this morning with no problems.
- Your SSH tunnel was incorrectly created.
Possible.
>> Try again, but do this. I am told that every Mac has the OpenSSH command. So, build the tunnel through the command line by entering:
ssh -L 7070:onlsrv:7070 onl.arl.wustl.edu
Leave the window open, and try committing a simple experiment:
- Start up the RLI and try to commit one cluster ...
- Make a reservation using the RLI: File => Make Reservation
- Add a cluster: Topology => Add Cluster
- Ask for resources: File => Commit
++ If the tunnel is BAD, File => Make Reservation will fail.
++ If the message is Unable to connect: couldn't get I/O for 127.0.0.1, then the tunnel was never built or you left off the -L flag or something like that.
++ If the message is IO Exception, the request is getting out of your machine, but the RLI didn't get a response. Typically, that means the request didn't get to the Daemon. And right now, I would say the cause is something in your ssh command line.
++ It is possible that this part succeeds and then you later get the IO Exception error message. This would mean that your tunnel is OK but something happened (see below).
- The SSH tunnel is OK but something at your end is causing the problem.
Possible. There are many possibilities:
- Your shell has autologout set, meaning that after so many seconds it will automatically log you out and terminate the session. If so, when you lose the connection, it will display "auto-logout". Some shells have an environment variable TMOUT for this. "echo $TMOUT" will tell you if it is set to anything.
- Your SSH has a timeout feature. This is NOT typical on the client side. There is a server-side setting to kill idle connections, but our server doesn't do that.
- Your departmental/organizational firewall or NAT box may have a timeout feature that will disconnect you if it sees no traffic for, say, 10 minutes. Users from some small universities have had this problem. You have to talk to your network administrator about this. I AM GUESSING THAT THIS IS YOUR PROBLEM, BUT THAT IS JUST A GUESS.
A1: No. The reservation should cover only those parts where you actually need to commit (bind) actual resources. You can do that either through an advanced reservation (see sidebar) or through the dialogue box the RLI pops up when you commit. If the testbed is very busy, it is best to make an advanced reservation.
A2: Yes, the RLI changes every once in a while. And it does complain if the version is old enough. We usually announce new versions to those using ONL as part of a course. The procedure for getting the RLI.jar file is the same as it has always been. You have two options:
1) Use HTTP: Click the "Get RLI.jar" link in the "Getting Started" page to download it from the Web. [[ If the resulting file is not the one above, then perhaps you need to flush your browser cache ... this should not be necessary unless you have a long lived www connection ]]
2) Use scp: The HTTP version is really obtained from onlusr.arl.wustl.edu:~onl/export/RLI.jar. So, you can SSH into onlusr and copy it from /users/onl/export/RLI.jar.
A3: Normally, this should not happen. But occasionally, an NSP or host can fail to initialize properly. If the NSP initialization fails, then close the experiment (File => Close) and try again. In rare cases when there are catastrophic hardware problems, all NSPs can end up in the repair state, leaving no available NSPs. This situation cannot be resolved until the staff fixes the underlying problem. If a single host or link fails, you can continue to use the NSP if you don't need that particular part of the setup. An email about the failure is sent to our staff, but the NSP is not placed in the repair state.
A4: The reservation is not considered to be in use until you commit. Do not ignore the message: any reservation left unused for the first 30 minutes of its period will be canceled. Some advice:
1) Make the beginning time of the reservation for when you think you will commit; and
2) Do a File => Commit even if you are not done with the network topology.
After the first commit, we assume that you have arrived for your reservation and we will not bother you anymore until near the end of the reservation period when you will get a warning message. But the RLI will pop up a dialogue box that asks if you want to extend your reservation period. If it is possible, the reservation will be extended. Even if the reservation is not extended, you can continue to work as long as no one else makes a reservation that will require your NSP.
A5: Nothing. Email is automatically sent to our staff, and someone will look into the problem. But since reservations are now overbooked, resources may remain insufficient until we look at the NSP, fix the problem, and put it back into service. Sometimes the problem can be quickly resolved, but it depends on the nature of the problem.
A1: The bandwidth in most cases is measured inside the switch fabric where IP packets are encapsulated inside ATM cells. These cells consist of a 5-byte header and a 48-byte payload leading to a 10% overhead. You can remove this overhead from the bandwidth charts by clicking on the label (e.g., OPP BW 3) to get a dialogue box that gives some details about the measurement point. Remove the check mark in the include ATM header check box. See Tutorial => The Remote Laboratory Interface => Features of Monitoring Panels and Tutorial => The Remote Laboratory Interface => Monitoring Concepts.
A2: Egress output rates are controlled by a token bucket regulator that has a granularity of around 54.1 Kbps; i.e., all egress rates are integer multiples of 54.1 Kbps.
A3: That menu item is actually selecting a monitoring point that is inside the switch fabric leading into output Port 3, not what is going out of the link attached to Port 3. If you really want to see the bandwidth going out of Port 3 and you are sending fixed length packets, you could monitor Port 3 => FPX Counters => Egress Packets and multiply the packet count by the length of the packet using the Formula feature of monitoring charts. See Tutorial => The Remote Laboratory Interface => Monitoring Concepts.
A4: The egress link rate is controlled by an FPX token bucket regulator. The current implementation has this behavior. We are looking into changing it to conform more to what you would expect where the interarrival times are the same for packets of the same size. See Tutorial => Filters, Queues and Bandwidth => NSP Architecture => Link Rate.
A1: The plugins have to be written in C, not C++. All variable declarations in C have to be at the beginning of a block. They can't appear randomly throughout the code as they can in C++.
A2: You should:
- Delete the plugin instance
- Unload the plugin
- Create an instance of the plugin again
If I know that I will be outputting debug messages, I just change the first MSRDEBUG call in the handle_packet routine. But I will sometimes change the handle_message routine in a way that lets me tell whether I have the new plugin instance loaded. For example, if I keep message type 0 as a Hello message, I could return the version number as part of the reply message.
A3: Plugins are in the kernel, and the kernel doesn't have floating point. You will have to do it in integer and perhaps use approximations. For example, 0.01 is 1/100. It is a pain when going to smaller fractions. That's why if you look at something like Van Jacobson's RTT estimation calculation it involves powers of 2 so that it can be done using the shift operator ... i.e., x/8 is x>>3.
A4: Plugins are kernel code, and the kernel doesn't have these functions. You will have to code them yourself. But pow() and ceil() are trivial. log() is not trivial. But I suggest you approximate log(x). Your application is probably using a limited range of x. So, use small table of log values and use linear interpolation. Or, use the first few terms of a Taylor series scaled to be integer. What a pain. I would just do a very crude approximation using linear interpolation. You can't be expected to write a real kernel version of log(x) for a 2 week project.
A5: This is kernel programming ... there is no such thing as stdlib.
I describe a workaround. Yes, it uses rand() which doesn't generate very good random numbers, but who cares right now.
Here is what you do:
- Copy rand.c from the directory ~onl/stdPlugins/dropdelay-610/ to yours:
cp ~onl/stdPlugins/dropdelay-610/rand.c .
- Change your Makefile so that it lists rand.c:
SRCS=$(KMOD).c rand.c
- Compile as before. Make sure that you check for undefined symbols as in any standard Makefile (look at the one in the dropdelay-610/ directory).
A6: You need to do a global replace of "pdelay" with "lab4" in both lab4.c and lab4.h. If you use vim, do something like:
... make a backup copy of lab4.c ...
vim lab4.c
:g/pdelay/s//lab4/g
:wq
A7: It is not an error because the kernel frees the buffer after it forwards it in the delay plugin. If we were really good, we would have defined a msr_drop_pkt function which would encapsulate the freeing of the buffer and you wouldn't even know it was happening when you called it. But we didn't. So, stats.c shows the explicit dropping because the kernel has no idea that the buffer needs to be freed.
A8: You are saying that there seems to be this background traffic of 6.8 Mbps, right? What you need to do is turn off Distributed RP-Queueing (DQ). RLI.jar should turn DQ OFF by default. Do this:
NSP => Queueing (i.e., click the center of the NSP) ... Make sure that DQ is NOT checked ...
The DQ algorithm automatically computes ingress-side VOQ rates, but in doing so generates around 6 Mbps of control traffic. So, turn it off. Then the default VOQ rates will be static (default = 600 Mbps) and there will be no extra control traffic.
A9: Use assign_udpCksums((iphdr_t *) iph) where iph is a pointer to the IP header. assign_udpCksums is defined in /users/onl/wu_arl/msr/rp/plugins/include/ipnet.h and is included by #include <plugins/include/ipnet.h>. But remember that all of the remaining fields in the IP and UDP headers must already have their final values; i.e., you don't want to compute the checksum and then decide to change one of the header fields.
onlusr> source /users/onl/.topology.csh
I get an unexpected end of file.
A1: This looks like you are trying to source a c-shell script when you are actually running the bash shell. Yes, when I enter:
ls -al ~mndd
I see files in your home directory like .bashrc. And when I enter:
ypcat passwd | grep mndd
I see:
sec:x:5261:5005:max nobody:/users/mndd:/bin/bash
which indicates (last field) that your shell is bash and not csh. So, you need to do this:
onlusr> source /users/onl/.topology
i.e., source the file .topology, NOT .topology.csh.
A2: Right. It looks like the default command search PATH for most users does not contain the current directory ("."). That means that if touchme is in the current directory, you will need to enter ./touchme in order for your shell to find the script. Also, since it is a script, check that it has execute permissions.
A3:
+ Are you pinging from the correct host; i.e., usually not onlusr?
+ Are you pinging to the correct host?
+ Have you installed routes in both the forward and reverse directions? (The brute force method: 'Topology => Generate default routes' will generate default routes on all ports) (Note: This should not be necessary if you are using a predefined configuration file unless your instructor says that you need to define routes.)
A1: There can be many reasons, but it looks like you (and the rest of the students in your class) need to make your home directories world searchable.
I will change all of your home directory permissions to 711 (rwx --x --x) so you don't have to do this.
REPEAT: You do not have to do anything because I have already changed the permissions on JUST home directories. But if you want to know how to do it on your own and test it easily, read on. Do this:
chmod 711 ~UserName
where UserName is your username. This makes your home directory searchable by everyone including Apache. I tested this idea by changing the permissions on the home directory and then using a test tunnel that goes only over the control network directly to the host $n1p3 by doing this (replace UserName with your username and $n1p3 with the external interface name of the host on port 3):
- Make sure the .www-docs directory is world readable/searchable:
chmod 755 ~/.www-docs
- Put a test web page in the .www-docs subdirectory (~mndd/index.html just displays a "Hello" message):
cp ~mndd/index.html ~/.www-docs/
- Build the test tunnel (this tunnel goes from your client to onlusr to $n1p3 over the control network; it doesn't use the NSP router):
ssh -L 8080:$n1p3:80 onl.arl.wustl.edu
- Enter http://localhost:8080/~UserName in your browser.
You should see the message "Hello from ONL".
A2: It looks like you did not build your RLI tunnel. See the "Getting Started" link in the sidebar of the ONL web page.
$n1p2> ssh -g -L 8080:n1p3:80 n1p3
I get "Connect to host n1p3 port 22: Network is unreachable." What could be the problem?
A3: I assume that you issued the following command from $n1p2 (where $n1p2 is either onl32, onl38, onl26, or onl11, depending on which NSP was assigned to you during the commit):
$n1p2> ssh -g -L 8080:n1p3:80 n1p3
This means that you want to create a tunnel from port 8080 on $n1p2 to port 80 on n1p3, and that the terminal session will log onto the host n1p3 through the NSP. Port 80 is the usual port on which Apache listens for requests. The error "Connect to host n1p3 port 22: Network is unreachable" typically means that there is a hardware problem along the physical path to n1p3, or that there is no route to n1p3, or that there is a problem along the return path from n1p3.
I suspect that the real problem is that you have no route established from n1p2 to n1p3; i.e., the route table at port 2 is not configured properly. So, now the question is exactly what is the problem?
- Step 1 (Find out if there is connectivity from n1p2 to n1p3)
Try to ping n1p3 from the host $n1p2 (you should replace $n1p2 with the correct interface name):
onlusr> ssh $n1p2
$n1p2> ping n1p3
If you get a normal response, the route is ok. If not, there is no route to n1p3. Assuming that you get no response ...
- Step 2 (Look at the route table at port 2 of the NSP)
In the RLI, select on the NSP icon:
Port 2 => Route Table
You should see default route entries:
prefix/mask      next hop
192.168.1.16/28  0
192.168.1.32/28  1
192.168.1.64/28  2
192.168.1.80/28  3   <== The important one
... etc ...
The 4th entry is the important one. If the route table is empty, you should generate the default routes for all ports (if you are using the "syndemo.exp" config file, the default routes should already have been created ... let me know if you are using the file). Create default routes for all ports by going to the RLI main menu and selecting:
Topology => Generate Default Routes
Now, your ping command in Step 1 should succeed. So, repeat Step 1. If successful, try making the tunnel again.
- Step 3 (Try creating the gateway tunnel again)
$n1p2> ssh -g -L 8080:n1p3:80 n1p3
A4: Assuming that your ONL login name is mndd, I see that you are missing the file /users/mndd/.www-docs/syndemo/Images/touchme.
onlusr> source /users/onl/.topology.csh
I get an unexpected end of file.
A5: This looks like you are trying to source a c-shell script when you are actually running the bash shell. Yes, when I enter:
ls -al ~mndd
I see files in your home directory like .bashrc. And when I enter:
ypcat passwd | grep mndd
I see:
sec:x:5261:5005:max nobody:/users/mndd:/bin/bash
which indicates (last field) that your shell is bash and not csh. So, you need to do this:
onlusr> source /users/onl/.topology
i.e., source the file .topology, NOT .topology.csh.
When I enter http://localhost:8080/~YourUserName/syndemo in the browser, I get "Unable to connect" and "Firefox can't establish a connection to the server at localhost:8080."
A6: This last problem indicates that you have not properly set up the tunnel at your client host so that traffic to port 8080 will go to the relay node. In the instructions for Approach 1, it says that you need to build the A-B-C tunnels like this:
client> ssh -L 5050:$n1p1a:5050 -L 3552:$n1cp:3552 -L 8080:$n1p2:8080 onl.arl.wustl.edu
where you need to replace $n1p1a, $n1cp, and $n1p2 with the appropriate ONL host names, which you can get by clicking on the n1p1a icon, the NSP icon, and the n1p2 icon (e.g., onl35). These instructions are about 3/4 of the way down the SYN Demo Web page.
When I ran (cd .www-docs/syndemo/Images; touchme &), the script did not run. The following error message was shown in the command-line window:
-bash: cd: .www-docs/syndemo/Images: No such file or directory
-bash: touchme: command not found
[1]+ Exit 127 touchme
A7: I am guessing that you ran it from some directory other than your home directory.
I looked in your home directory ~mndd and you have the directories:
.www-docs/
.www-docs/syndemo/
.www-docs/syndemo/Images/
So, that looks ok. The command you ran needs to be run from your home directory, or you need to change it so that the path .www-docs/syndemo/Images is correct. I just su'd to your account and tried the command, and it worked as expected. So, try this on your $n1p3 host:
cd
(cd .www-docs/syndemo/Images; ./touchme &)
The first command changes to your home directory. The second one executes the touchme command. If you want to kill the touchme command:
ps -l
kill -9 X
where X is the PID of the touchme script.
When I ran sudo /usr/local/bin/sec/synster, the following message appeared in the window:
"Sorry, user mndd is not allowed to execute '/usr/local/bin/sec/synster' as root on onl41.arl.wustl.edu."
where onl41.arl.wustl.edu is the external address of n1p1a (where the attacker resides).
A8: I pushed out a new sudo file to resolve this second problem. I tried it with your account on onl41, and it worked. So, try it now. The problem should go away.
A9: Right. It looks like the default command search PATH for most users does not contain the current directory ("."). That means that if touchme is in the current directory, you will need to enter ./touchme in order for your shell to find the script. Also, since it is a script, check that it has execute permissions.
A10: [[ THIS IS THE IMPORTANT PARAGRAPH ]]
It looks like you told the RLI that your plugin directory was /users/mndd/plugins. If so, it assumes that the directory will contain sub-directories with names of the form XXX-NNN, where XXX is a plugin name and NNN is a plugin number; e.g., syn_demo-54203. Inside this subdirectory ~mndd/plugins/syn_demo-54203/ should be the code for that plugin. I just looked at your directory /users/mndd/plugins, and it looks really messed up. But it now does have the subdirectory syn_demo-54203. So, you should no longer be getting that message ... even though the rest of the directory is messed up.
But if I were you I would just use the standard syn_demo plugin stuff supplied by default. Only if that plugin worked would I attempt to build my own version of the plugin. I think the config file in ~onl/export/Examples/syndemo/syndemo.exp uses the default plugin located at ~onl/stdPlugins/syn_demo-54203.
But if you really do want to create your own version of the plugin, and you are still having problems, I would recreate your plugin directory to conform to what I described above.
A11: It is very hard for me to debug this without knowing more information.
It sounds like you are saying that the browser is cycling through a sequence of images (5?) and that you can control the image transmission by clicking the Stop/Start button in the Web Client. Right?
Q1: Does the Bandwidth monitor show the Attacker traffic?
I suspect yes. If not, that would be really strange.
Q2: Did you deviate from the instructions in any way? If so, how?
Q3: What does the 'netstat' command tell you about traffic coming into the $n1p3 host? For example, do this WHILE IMAGE TRAFFIC IS GOING TO YOUR BROWSER:
onlusr> ssh $n1p3
$n1p3> netstat -i
... You should see 3 lines (eth0, eth1, lo) ...
... Record the RX-OK and TX-OK numbers ...
... Wait about 10 sec ...
$n1p3> netstat -i    # again
... Record the RX-OK and TX-OK numbers again ...
The difference between the second and first RX-OK numbers indicates how much traffic came into the host over that interface in 10 sec. The difference between the second and first TX-OK numbers indicates how much traffic went out of the host over that interface in 10 sec. The interface with the smallest numbers is probably the one going into the NSP. The other interface is connected to the control network. Unfortunately, the control network is sometimes on eth0 and sometimes eth1, so I can't tell in advance which interface it would be ... unless I knew the external host name (e.g., onl31, onl37, onl25, or onl10). In these 4 cases, I think eth0 is attached to an NSP and eth1 is attached to the control network.
The command "ls -l ~/.www-docs/syndemo/Images/*.jpg" indicates the sizes of the images range from 83735 bytes (evening.jpg) to 108161 bytes (nudy.jpg); i.e., about 100 KB per image. So, we expect that if image requests are sent every 3 sec, that in 10 seconds, you will see a difference in outgoing traffic (image) of about 300,000 bytes and a small amount of incoming traffic (HTTP request).
If I had to guess, the image traffic must be going over the control network instead of the private network (NSP). And that would only happen if Tunnel D were not built properly. This is only a guess. But if you did this:
$n1p2> ssh -g -L 8080:$n1p3:80 $n1p3
instead of this CORRECT WAY:
$n1p2> ssh -g -L 8080:n1p3:80 n1p3
when building Tunnel D, what you are seeing would happen. This last tunnel forces the traffic coming into port 8080 of the relay node to go to port 80 of n1p3 over the interface to the NSP. Another possibility is that you built Tunnel D ok, but for some strange reason the route going from n1p2 to n1p3 was not built properly and somehow the relay host $n1p2 found a path over the control network to $n1p3 (unlikely, but possible).
Of course, if the above fails, we could verify all of this ... painfully ... using tcpdump on the right interfaces along the traffic path and deciphering the output.
Warning: Permanently added the RSA host key for IP address '10.0.1.3' to the list of known hosts.
What does it mean and what's wrong?
A12: The short answer is that there is nothing wrong.
10.0.1.3 is the IP address of the eth0 interface of onlusr. You can see this by entering:
onlusr> /sbin/ifconfig
while logged into onlusr; note the inet addr field in the eth0 entry. You didn't say so, but my guess is that you got this message when you tried to build an SSH tunnel from one ONL host to another. More specifically, you must have entered the ssh command FROM onlusr to some other onl host. Whenever you SSH to a remote host X from host Y, the IP address of host Y (the FROM host) is looked up in the file ~/.ssh/known_hosts (a plaintext file with RSA keys) at the remote host X. If it is there, then you are connected to that host. If not, then SSH will add the hostname to the file after authentication. In your case, I noticed that your ~mndd/.ssh/known_hosts contains an entry for 10.0.1.3 as its first entry ... which makes sense. This is why ...
Your onl home directory is NFS mounted on every onl host, which also means that the file ~mndd/.ssh/known_hosts is accessible on every onl host. Suppose that you are on onlusr, and you enter something like:
onlusr> ssh onl31
All of your onl hosts (given to you through File => Commit) are set up to accept the ssh connection without asking for a password. But the ssh server (daemon) running on onl31 will still do some authentication. One thing it does is look at the file ~mndd/.ssh/known_hosts on onl31 to see if the IP address of onlusr (10.0.1.3) is a host that you have allowed to log in to onl31 before. The first time you do this, there is nothing in the known_hosts file. Since you are allowed to log in to your onl hosts from other onl hosts, ssh adds the host to the known_hosts file. Enter the command "man ssh" and scroll down to the section "Server authentication" for more details.
A13: runTcpMon.pl produces the file tcp.data in your home directory. Your file contains all 0s. That script is looking at the file /proc/net/tcp. Enter the command "cat /proc/net/tcp" on any of the onl hosts and you will see that the file contains the state of TCP connections to that host. The script is just looking for connections to port 80 (HTTP) (or hex 0050) that are in state 03 (partial connection). I am guessing (and this is only a guess) that the runTcpMon.pl script is running on the wrong host. It should be running on your $n1p3 host.
A14: The fact that you got traffic for 30 seconds is good because it shows that things are working. The fact that it stops after 30 seconds will be difficult to debug. Furthermore, the demo is fragile. So, if anything goes wrong, it doesn't recover well. Some comments:
- I would first try it again and see if this behavior repeats.
- If the touchme script is still running, the problem is probably NOT due to the browser thinking that it already has up-to-date images. You can check that the modification dates of the images are in fact being updated:
cd ~/.www-docs/syndemo/Images
ls -l *.jpg
... wait 10 seconds ...
ls -l *.jpg
... repeat occasionally to see that the modification dates are changing ...
- I don't know if the RTT between you and us would be an issue. It must be about 150-200 milliseconds. It shouldn't be an issue since the traffic volume is low.
- I don't know if network security at your end would be a problem. Again, I don't think so since you are able to begin an experiment.
- I am guessing that either the Web server or the plugin got overwhelmed. I would try turning the attacker off when the images stop and see if that allows outstanding requests to drain. Then turn it back on. And repeat this process. But if you are having this problem when the attacker has never been turned on, that would be a different problem.
A15: I just tested this on your Web page and it worked. So, there is no problem with your files in ~mndd/syndemo/.
The fact that you got "Page not found" probably indicates that you are communicating with some HTTP server, but I suspect the wrong one. Probably some tunnel was not built properly. But it is hard to tell. You will have to find out where the traffic is going by working backwards from the HTTP server $n1p3.
For example, if you monitor the ingress and egress bandwidth at port 3 of your NSP, you should see some traffic going out of and coming into that port when you hit the carriage return on the URL. If your http request got to $n1p3 but the page was not found by Apache on $n1p3, you should see two spikes in the plot, one for egress (the request) and then one for ingress (the error message).
If you don't see traffic at port 3, then go to port 2 and repeat the monitoring process.
If you don't see traffic at port 2, then look at the relay node's ($n1p2) network interfaces using "netstat -i" before you enter the URL and after.
My only other suggestion is to talk to your fellow student Mohammad Firas (login toshiba). He seems to have gotten most of the demo working.
Revised: Thu, Feb 1, 2007