This FAQ is a compilation of the most frequently asked questions. It is NOT a tutorial. You should still use the tutorial pages to find explanations of features and concepts. You may want to use your browser's Edit => Find to search for help or use our index below. The index has been divided into these sections to make it easier to find help:
| Tunnels and Connectivity | Problems with the RLI tunnel, connectivity, and ssh. |
| The Remote Laboratory Interface | Problems with using the basic RLI features but excluding filters, queues, and plugins. |
| Filters, Queues and Bandwidth | Problems with filters, and queues. |
| Router Plugins | Problems with using and writing router plugins. |
| Unix Commands | Problems with Unix commands such as source, ping, netstat, and iperf. |
Selecting a link from this table will take you to the Questions section. If you find a potentially helpful question, select the Q-label and that link will take you to the question/answer in the Questions and Answer section.
Warning: Permanently added the RSA host key for IP address '10.0.1.3' to the list of known hosts.
What does it mean and what's wrong?
A1: It looks like you did not build your RLI tunnel. See the "Getting Started" link in the sidebar of the ONL web page. The least troublesome way to build the RLI tunnel is to run the ssh command from the command line, if you can:
    ssh -L 7070:onlsrv:7070 onl.arl.wustl.edu
If you are using a graphical tool like PuTTY or SSH client, you will have to follow precisely the steps given in the Getting Started sidebar. The precise steps for building the RLI tunnel are given at the RLI SSH Tunneling link on that page. If you are taking a course that is using ONL, someone should be assigned to help you with this if you have problems.
Warning: Permanently added the RSA host key for IP address '10.0.1.3' to the list of known hosts.
What does it mean and what's wrong?
A2: The short answer is that there is nothing wrong.
10.0.1.3 is the IP address of the eth0 interface of the host onlusr; i.e., the ONL user host. You can see this by entering:
    onlusr> /sbin/ifconfig
while logged into onlusr and noting the inet addr field in the eth0 entry. You didn't say so, but my guess is that you got this message when you tried to build an SSH tunnel from one ONL host to another. More specifically, you must have entered the ssh command FROM onlusr to some other onl host.
Whenever you SSH to a remote host X from host Y, the IP address of host Y (the FROM host) is looked up in the file ~/.ssh/known_hosts (a plaintext file with RSA keys) at the remote host X. If it is there, then you are connected to that host. If not, then SSH will add the hostname to the file after authentication. In your case, I noticed that your ~mndd/.ssh/known_hosts contains an entry for 10.0.1.3 as its first entry ... which makes sense. This is why ...
Your onl home directory is NFS mounted on every onl host, which also means that the file ~mndd/.ssh/known_hosts is accessible on every onl host. Suppose that you are on onlusr, and you enter something like:
    onlusr> ssh onl031
All of your ONL hosts (given to you through File => Commit) are set up to accept the ssh connection without asking for a password. But the ssh server (daemon) running on onl031 will still do some authentication. One thing it does is look at the file ~/.ssh/known_hosts on onl031 (in your home directory) to see if the IP address of onlusr (10.0.1.3) is a host that you have allowed to login to onl031 before. The first time you do this, there is nothing in the known_hosts file. Since you are allowed to login to your onl hosts from other onl hosts, ssh adds the host to the known_hosts file.
Enter the command "man ssh" and scroll down to the section "Server authentication" for more details.
A3:
+ You cannot log into an ONL host unless you have an ONL account. This means that you must have either registered for an account through the ONL Web page or received a predefined login name as part of a course/tutorial (and an email).
+ You can only ssh into onl.arl.wustl.edu from outside of the testbed. Once the ssh succeeds, you will end up on the host acting as the user host (currently onlusr).
+ You can only ssh into other hosts after they have been committed to you; i.e., wait for the experiment commit to finish first.
A4: All requests from the RLI to the testbed go through the ONL Proxy Daemon. This type of error usually means that the connection between the RLI and that Proxy Daemon either was lost or never established. Here are some possibilities:
- The Daemon died.
Possible, but unlikely, since we have been using the system intensely in the last 1.5 weeks.
>> If so, just try again. I used the system this morning with no problems.
- Your SSH tunnel was incorrectly created.
Possible.
>> Try again, but do this. I am told that every Mac has the OpenSSH command. So, build the tunnel through the command line by entering:
    ssh -L 7070:onlsrv:7070 onl.arl.wustl.edu
Leave the window open, and try committing a simple experiment:
- Start up the RLI and try to commit one cluster ...
- Make a reservation using the RLI: File => Make Reservation
- Add a cluster: Topology => Add Cluster
- Ask for resources: File => Commit
++ If the tunnel is BAD, File => Make Reservation will fail.
++ If the message is Unable to connect: couldn't get I/O for 127.0.0.1, then the tunnel was never built or you left off the -L flag or something like that.
++ If the message is IO Exception, the request is getting out of your machine, but the RLI didn't get a response. Typically, that means the request didn't get to the Daemon. And right now, I would say the cause is something in your ssh command line.
++ It is possible that this part succeeds and then you later get the IO Exception error message. This would mean that your tunnel is OK but something happened (see below).
- The SSH tunnel is OK but something at your end is causing the problem.
Possible. There are many possibilities:
- Your shell has autologout set meaning that after so many seconds, it will automatically log you out and terminate the session. But if so, when you lose the connection, it will display "auto-logout". Some shells have an environment variable TMOUT for this. "echo $TMOUT" will tell you if it is set to anything.
- Your SSH has a timeout feature. This is NOT typical on the client side. There is a server-side setting to kill idle connections, but our server doesn't do that.
- Your departmental/organization firewall or NAT box may have a timeout feature that will disconnect you if it sees no traffic for, say, 10 minutes. Users from some small universities have had this problem. You have to talk to your network administrator about this. I AM GUESSING THAT THIS IS YOUR PROBLEM, BUT THAT IS JUST A GUESS.
A5: The RLI has a timeout mechanism that does not work well with a wireless network. You should always use a wired network when doing ONL experiments.
A1: No. The reservation should be for those parts where you actually need to commit (bind) actual resources. You can either do that through advanced reservations (see sidebar) or the RLI will pop up a dialogue box that allows you to do it when you commit. But if the testbed is very busy, it is best to make an advanced reservation.
A2: Yes, the RLI changes every once in a while. And it does complain if the version is old enough. We usually announce new versions to those using ONL as part of a course. The procedure for getting the RLI.jar file is the same as it has always been.
- Use HTTP: Click the "Get RLI" link in the NPR section in the sidebar of the ONL Web page. [[ If the resulting file is not the one above, then perhaps you need to flush your browser cache ... this should not be necessary unless you have a long lived www connection ]]
A3: Normally, this should not happen. But occasionally, an NPR or host can fail to properly initialize. If the NPR initialization fails, then close the experiment (File => Close) and try again. In rare cases when there are catastrophic hardware problems, all NPRs can end up in the repair state, leaving no available NPRs. This situation cannot be resolved until the staff fixes the underlying problem. If a single host or link fails, you can continue to use the NPR if you don't need that particular part of the setup. An email about the failure is sent to our staff, but the NPR is not placed in the repair state.
A4: The reservation is not considered to be in use until you commit. Do not ignore the message: any reservation left unused for the first 30 minutes of its reservation period will be canceled. Some advice:
1) Make the beginning time of the reservation for when you think you will commit; and
2) Do a File => Commit even if you are not done with the network topology.
After the first commit, we assume that you have arrived for your reservation and we will not bother you anymore until near the end of the reservation period, when you will get a warning message. At that point, the RLI will pop up a dialogue box that asks if you want to extend your reservation period. If it is possible, the reservation will be extended. Even if the reservation is not extended, you can continue to work as long as no one else makes a reservation that will require your NPR.
A5: Nothing. Email is automatically sent to our staff, and someone will look into the problem. But since reservations are now overbooked, we have to look at the NPR, fix the problem and put it back into service before there are sufficient resources. Sometimes the problem can be quickly resolved, but it depends on the nature of the problem.
A1: Output port rates are controlled by a token bucket regulator that has a granularity of around 64 Kbps; i.e., all port rates are integer multiples of 64 Kbps.
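To make the 64 Kbps granularity concrete, here is a small sketch of what "integer multiples of 64 Kbps" means for a requested rate. The function names are made up for illustration, and whether the NPR rounds down or to the nearest multiple is an assumption; this FAQ does not say, so both are shown.

```c
#include <stdint.h>

/* Step size taken from the answer above ("around 64 Kbps"). */
#define RATE_STEP_KBPS 64u

/* ASSUMPTION: round the requested rate DOWN to a multiple of 64 Kbps. */
uint32_t quantize_rate_down(uint32_t kbps)
{
    return kbps - (kbps % RATE_STEP_KBPS);
}

/* ASSUMPTION: round the requested rate to the NEAREST multiple of 64 Kbps. */
uint32_t quantize_rate_nearest(uint32_t kbps)
{
    return ((kbps + RATE_STEP_KBPS / 2u) / RATE_STEP_KBPS) * RATE_STEP_KBPS;
}
```

For example, a requested 1000 Kbps is not a multiple of 64, so the effective rate would be 960 Kbps under the first assumption and 1024 Kbps under the second.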
A2: The output port rate is controlled by a token bucket regulator. The current implementation has this behavior. See NPR Tutorial => Filters, Queues and Bandwidth => Setting Port Rate.
A1: The plugins have to be written in Microengine C for the IXP, not C++. All variable declarations in C have to be at the beginning of a block. They can't appear randomly throughout the code as they can in C++.
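Here is a small sketch of the declaration rule in plain C. The helper sum_bytes is made up for illustration (it is not an ONL plugin function), and the code has not been run through the Microengine C compiler.

```c
/* Hypothetical helper: sums the bytes of a buffer using only
 * declarations placed at the start of a block. */
unsigned int sum_bytes(const unsigned char *buf, unsigned int len)
{
    /* All declarations come first in the function body ... */
    unsigned int i;
    unsigned int total = 0;

    for (i = 0; i < len; i++) {
        /* ... and at the top of any nested block that needs a local. */
        unsigned int b = buf[i];
        total += b;
    }

    /* Not accepted under this rule (C++ style):
     *     total += 1;
     *     unsigned int extra = 2;         declaration after a statement
     *     for (unsigned int j = 0; ...)   declaration inside for()
     */
    return total;
}
```

If the compiler complains about a declaration, moving it up to the opening brace of the enclosing block is usually the fix.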
A2: You should:
- Delete the plugin from the Plugin Table
- Recompile (e.g., "make clean; make")
- Add the plugin to the Plugin Table
A3: The microengines don't have floating point. You will have to do it in integer and perhaps use approximations. For example, 0.01 is 1/100. It is a pain when going to smaller fractions. That's why if you look at something like Van Jacobson's RTT estimation calculation it involves powers of 2 so that it can be done using the shift operator ... i.e., x/8 is x>>3.
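Here is a sketch of the integer tricks mentioned above, written in standard C with stdint types (an assumption; it has not been tried with the Microengine C compiler). All names are made up for illustration. It shows x/8 as a shift, an approximation of 0.01*x using a power-of-2 denominator, and a running average with gain 1/8 in the style of the RTT estimator.

```c
#include <stdint.h>

/* x/8 done with a shift instead of a divide (x is unsigned). */
uint32_t div8(uint32_t x)
{
    return x >> 3;
}

/* Approximate 0.01 * x without floating point: 0.01 is about 655/65536,
 * so multiply by 655 and shift right by 16. Assumes x is small enough
 * (below about 6.5 million) that x * 655 fits in 32 bits. */
uint32_t mul_0_01(uint32_t x)
{
    return (x * 655u) >> 16;
}

/* Running average with gain 1/8:
 *     avg = avg + (sample - avg) / 8
 * The state holds 8 * avg so that the fractional part is not thrown away. */
typedef struct {
    uint32_t scaled_avg;   /* 8 * current average */
} ewma8_t;

void ewma8_init(ewma8_t *e, uint32_t first_sample)
{
    e->scaled_avg = first_sample << 3;
}

void ewma8_update(ewma8_t *e, uint32_t sample)
{
    /* 8*new_avg = 8*avg - avg + sample */
    e->scaled_avg = e->scaled_avg - (e->scaled_avg >> 3) + sample;
}

uint32_t ewma8_value(const ewma8_t *e)
{
    return e->scaled_avg >> 3;
}
```

Starting from an average of 100 and feeding in a sample of 180 moves the average to 110, which is 100 + (180 - 100)/8. The 0.01 approximation is slightly low (655/65536 is a bit under 1/100), which is the kind of error you accept in exchange for avoiding a divide.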
A4: The IXP library doesn't have these functions. You will have to code them yourself. But pow() and ceil() are trivial. log() is not trivial. But I suggest you approximate log(x). Your application is probably using a limited range of x. So, use a small table of log values and use linear interpolation. Or, use the first few terms of a Taylor series scaled to be integer. What a pain. I would just do a very crude approximation using linear interpolation. You can't be expected to write a real kernel version of log(x) for a 2-week project.
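Here is one crude way to do it, as a sketch in standard C (not tested with the Microengine C compiler). All names and the table layout are my own choices for illustration: the table holds 1000*ln(x) at powers of 2 from 1 to 1024, and values in between are linearly interpolated, so the division in the interpolation becomes a shift. Integer versions of pow() and ceil() of a ratio are included too.

```c
#include <stdint.h>

/* base^exp for unsigned integers, by repeated squaring (no overflow check). */
uint32_t ipow(uint32_t base, uint32_t exp)
{
    uint32_t result = 1;

    while (exp != 0) {
        if (exp & 1u)
            result *= base;
        base *= base;
        exp >>= 1;
    }
    return result;
}

/* ceil(a / b) for unsigned integers, b must not be 0. */
uint32_t ceil_div(uint32_t a, uint32_t b)
{
    return (a + b - 1u) / b;
}

/* 1000 * ln(2^k) for k = 0..10, rounded to the nearest integer. */
static const uint32_t LN_TABLE[11] = {
    0, 693, 1386, 2079, 2773, 3466, 4159, 4852, 5545, 6238, 6931
};

/* Crude 1000 * ln(x) for 1 <= x <= 1024. Values outside that range
 * are clamped to the ends of the table. */
uint32_t ln_milli(uint32_t x)
{
    uint32_t k = 0;
    uint32_t lo;
    uint32_t span;

    if (x <= 1u)
        return 0;
    if (x >= 1024u)
        return LN_TABLE[10];

    while ((x >> (k + 1u)) != 0)      /* k = floor(log2(x)) */
        k++;

    lo = LN_TABLE[k];
    span = LN_TABLE[k + 1u] - lo;
    /* Interpolate between 2^k and 2^(k+1); dividing by 2^k is a shift. */
    return lo + ((span * (x - (1u << k))) >> k);
}
```

The interpolation is exact at powers of 2 and worst in the middle of each interval: ln_milli(3) gives 1039 where the true value is about 1099. If that is too crude for your range of x, use a denser table over just that range.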
A5: There is no such thing as a full stdlib available.
I describe a workaround. Yes, it uses rand() which doesn't generate very good random numbers, but who cares right now.
Here is what you do:
- Copy rand.c from the directory ~onl/stdPlugins/dropdelay-610/ to yours:
    cp ~onl/stdPlugins/dropdelay-610/rand.c .
- Insert the code into your plugin source code.
- Compile as before.
A6: Use onl_api_udp_cksum(). See NPR Tutorial => Summary Information => Plugin Functions and the stringSub plugin source code. But remember that all of the remaining fields in the IP and UDP headers must already have their final values; i.e., you don't want to compute the checksum and then decide to change one of the header fields.
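To see why the header fields must be final first, here is a sketch of the generic Internet (one's complement) checksum in standard C. This is NOT onl_api_udp_cksum(), whose signature is not shown in this FAQ; inet_checksum is a made-up name. A real UDP checksum also covers a pseudo-header, which this sketch does not build.

```c
#include <stddef.h>
#include <stdint.h>

/* One's complement checksum over a byte buffer, taken as big-endian
 * 16-bit words. An odd trailing byte is padded with a zero byte. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1u)
        sum += (uint32_t)data[len - 1] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);

    return (uint16_t)~sum;
}
```

Because every byte feeds into the sum, changing even one header byte after the checksum has been computed gives a different result, and the receiver will see a checksum mismatch.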
    onlusr> source /users/onl/.topology.csh
I get an unexpected end of file.
A1: This looks like you are trying to source a c-shell script when you are actually running the bash shell. Yes, when I enter for the user mndd:
    ls -al ~mndd
I see files in your home directory like .bashrc. And when I enter:
    ypcat passwd | grep mndd
I see:
    sec:x:5261:5005:max nobody:/users/mndd:/bin/bash
which indicates (last field) that your shell is bash and not csh. So, you need to do this:
    onlusr> source /users/onl/.topology
i.e., source the file .topology, NOT .topology.csh.
A2: Right. It looks like the default command search PATH for most users does not contain the current directory ("."). That means that if touchme is in the current directory, you will need to enter ./touchme in order for your shell to find the script. Also, since it is a script, check that it has execute permissions.
A3:
+ Are you pinging from the correct host; i.e., usually not onlusr?
+ Are you pinging to the correct host?
+ Have you installed routes in both the forward and reverse directions? (The brute force method: 'Topology => Generate default routes' will generate default routes on all ports) (Note: This should not be necessary if you are using a predefined configuration file unless your instructor says that you need to define routes.)
Revised: Wed, Jan 21, 2009