NPR Tutorial >> Writing A Plugin | TOC |
So far, when an error occurred in a simple plugin, we either ignored it or sent a debug message to the XScale control processor. Ignoring an error makes it hard to locate the root cause of misbehavior, and a high volume of debug messages to the XScale can lock up the NPR. A third approach is to log an error code (perhaps along with auxiliary data) that the user can retrieve with a control message.
We used the error logging approach in the priq plugin both to handle errors and to locate a difficult bug. The priq plugin implements a form of strict priority queueing with three flow priorities: high, medium and low.
    // error codes
    #define BAD_QUEUE_INIT_ERR  1   // bad queue_init()
    #define BAD_ENQ_ERR         2   // bad queue_enq()
    #define BAD_POP_EMPTY_ERR   3   // bad queue_pop() - empty queue
    #define BAD_POP_FREE_ERR    4   // bad queue_pop() - free() failed
    #define BAD_NXTBLK          5   // bad nextBlk
    #define BAD_OUT_PORT        6   // bad output port in meta-pkt
    #define BAD_QID             7   // bad QID in meta-pkt
    #define BAD_HANDLE_A        98  // bad buffer handle in meta-pkt
    #define BAD_HANDLE_B        99  // bad buffer handle in meta-pkt = 0

    __declspec(shared gp_reg) unsigned int nerrs;              // #errors
    volatile __declspec(shared sram) unsigned int errno[5];    // 1st 5 errors

    static __forceinline void
    helper_set_errno( __declspec(local_mem) unsigned int n )
    {
        if( nerrs < 5 ) errno[nerrs] = n;       // keep only the first 5 codes
        ++nerrs;                                // but count every error
        onl_api_plugin_cntr_inc(pluginId, 0);   // external error counter
    }
The code fragment above defines nine error codes. The function helper_set_errno() stores up to five error codes in the array errno[5], while nerrs counts every error. The limit of five was an arbitrary choice, but it is typical since we are usually interested in the earliest errors. In addition, each error increments Plugin Counter 0, which we chart to get an indication of errors.
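The bounded-log idea can be tried outside the NPR. The following is a minimal host-side sketch in plain C, not the plugin's code: the struct err_log type and the names MAX_LOGGED_ERRS, err_log_set() and err_log_valid() are hypothetical, and the __declspec qualifiers and the plugin counter call are left out.

```c
#define MAX_LOGGED_ERRS 5   /* hypothetical name; the plugin hard-codes 5 */

/* Host-side stand-in for the plugin's shared error state (assumption:
 * ordinary memory replaces the gp_reg and sram declarations). */
struct err_log {
    unsigned int nerrs;                    /* total number of errors seen */
    unsigned int codes[MAX_LOGGED_ERRS];   /* earliest error codes only   */
};

/* Record an error: keep only the earliest codes, but count every error. */
static void err_log_set(struct err_log *log, unsigned int code)
{
    if (log->nerrs < MAX_LOGGED_ERRS)
        log->codes[log->nerrs] = code;
    ++log->nerrs;
}

/* Number of valid entries in codes[], which can be smaller than nerrs. */
static unsigned int err_log_valid(const struct err_log *log)
{
    return log->nerrs < MAX_LOGGED_ERRS ? log->nerrs : MAX_LOGGED_ERRS;
}
```

Because the counter keeps running after the array is full, a reader of the log can tell both what the first errors were and how many errors followed them.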
    void plugin_init_user()
    {
        if(ctx() == 0)
        {
            reset_counters( );

            queue_lock = UNLOCKED;
            if( queue_init_free() != 0 ) {
                helper_set_errno( BAD_QUEUE_INIT_ERR );
            }

            queue_init_desc( &queue[SILVERQ], SILVERQ );
            queue_init_desc( &queue[BRONZEQ], BRONZEQ );
        }
    }
The code fragment above, taken from the priq plugin, shows a typical use of helper_set_errno(): if queue_init_free() returns a nonzero value, the plugin logs BAD_QUEUE_INIT_ERR.
The priq plugin also introduced run-time checking with auxiliary data logging. During development, the plugin would run fine for thousands of packets and then mysteriously stop forwarding packets. When this lockup occurred, a queue length chart showed millions of packets, a sign that the Queue Manager was being confused by corrupted meta-packets.
    void
    helper_check_meta( plugin_out_data my_ring_out )
    {
        if( my_ring_out.plugin_qm_data_out.out_port != 4 ) {    // check output port
            helper_set_errno( BAD_OUT_PORT );
        }
        if( my_ring_out.plugin_qm_data_out.qid != 41024 ) {     // check QID
            helper_set_errno( BAD_QID );
        }
        // check buffer handle
        if( (my_ring_out.plugin_qm_data_out.buf_handle_lo24 & 0x7) != 0 ) {
            helper_set_xdata( my_ring_out.plugin_qm_data_out.buf_handle_lo24 );
            helper_set_errno( BAD_HANDLE_A );
            ++nerrsA;
        }
        if( my_ring_out.plugin_qm_data_out.buf_handle_lo24 == 0 ) {
            helper_set_xdata( my_ring_out.plugin_qm_data_out.buf_handle_lo24 );
            helper_set_errno( BAD_HANDLE_B );
            ++nerrsB;
        }
    }
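To check for sane meta-packet fields, the priq plugin calls the helper_check_meta() function before forwarding a meta-packet to the Queue Manager. An error is indicated if any of the following are true:
  - The output port is not 4 (BAD_OUT_PORT).
  - The QID is not 41024 (BAD_QID).
  - Any of the low three bits of the buffer handle is set (BAD_HANDLE_A).
  - The buffer handle is 0 (BAD_HANDLE_B).
For the two buffer handle errors, the offending handle is also recorded with helper_set_xdata().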
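The same four checks can be exercised on a host machine. The sketch below is an illustration under stated assumptions, not the plugin's code: struct meta_pkt, the META_* flag names, the EXPECTED_* constants and check_meta() are all hypothetical, and the function returns a bit mask instead of logging error codes so that the result is easy to inspect.

```c
/* Hypothetical flag names, one per check made by helper_check_meta(). */
enum {
    META_BAD_OUT_PORT    = 1,   /* output port is not the expected one */
    META_BAD_QID         = 2,   /* QID is not the expected one         */
    META_BAD_HANDLE_BITS = 4,   /* low three bits of handle are set    */
    META_BAD_HANDLE_ZERO = 8    /* handle is zero                      */
};

/* Expected values, taken from the constants in the plugin fragment. */
#define EXPECTED_OUT_PORT 4u
#define EXPECTED_QID      41024u

/* Hypothetical stand-in for the fields of plugin_qm_data_out. */
struct meta_pkt {
    unsigned int out_port;
    unsigned int qid;
    unsigned int buf_handle_lo24;
};

/* Return 0 if the meta-packet looks sane, else a mask of META_* flags. */
static unsigned int check_meta(const struct meta_pkt *m)
{
    unsigned int flags = 0;

    if (m->out_port != EXPECTED_OUT_PORT)
        flags |= META_BAD_OUT_PORT;
    if (m->qid != EXPECTED_QID)
        flags |= META_BAD_QID;
    if ((m->buf_handle_lo24 & 0x7u) != 0)
        flags |= META_BAD_HANDLE_BITS;
    if (m->buf_handle_lo24 == 0)
        flags |= META_BAD_HANDLE_ZERO;

    return flags;
}
```

For example, a packet with output port 4, QID 41024 and handle 0x1238 passes all four checks, while a handle of 0x1235 fails only the low-bits check.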
    static __forceinline void
    helper_send_from_queue_to_QM( __declspec(shared,sram) struct queue_tag *qptr )
    {
        plugin_out_data my_ring_out;    // ring data to next block
        int             rc;

        my_ring_out.plugin_qm_data_out.out_port        = qptr->hd->out_port;
        my_ring_out.plugin_qm_data_out.qid             = qptr->hd->qid;
        my_ring_out.plugin_qm_data_out.l3_pkt_len      = qptr->hd->l3_pkt_len;
        my_ring_out.plugin_qm_data_out.buf_handle_lo24 = qptr->hd->buf_handle;

        rc = queue_pop( qptr );
        if( rc == -1 ) {
            helper_set_errno( BAD_POP_EMPTY_ERR );
        } else if( rc == -2 ) {
            helper_set_errno( BAD_POP_FREE_ERR );
        }

        if( debug_on ) helper_check_meta( my_ring_out );
        scr_ring_put_buffer_3word( PLUGIN_TO_QM_RING, my_ring_out.i, 0 );
    }
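The helper_send_from_queue_to_QM() function above calls helper_check_meta() before it sends the meta-packet to the Queue Manager by calling scr_ring_put_buffer_3word(). The function helper_send_from_queue_to_QM():
  - Copies the output port, QID, L3 packet length and buffer handle from the entry at the head of the queue into the ring data.
  - Calls queue_pop() and logs BAD_POP_EMPTY_ERR if it returns -1 or BAD_POP_FREE_ERR if it returns -2.
  - Calls helper_check_meta() on the ring data if debug_on is set.
  - Puts the three-word meta-packet on PLUGIN_TO_QM_RING.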
    #define NX 32

    volatile __declspec(shared sram) unsigned int xdata[NX];  // extra data
    volatile __declspec(shared sram) unsigned int nxdata;     // #xdata valid
    volatile __declspec(shared sram) unsigned int nxget;      // next to send

    static __forceinline void
    helper_set_xdata( unsigned int x )
    {
        // record auxiliary data
        if( nxdata < NX ) {
            xdata[nxdata] = x;
            ++nxdata;
        }
    }
The xdata[] array is handled much like errno[], with two differences. First, we record up to 32 values instead of five. Second, only two of the possible 32 auxiliary values can be returned by one =xdata control message request. The variable nxget indicates the next xdata[] element to be sent, so the user can get all 32 values by issuing 16 =xdata requests.
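The two-values-per-request retrieval can be sketched on a host machine as follows. This is an illustration, not the plugin's control message handler, which is not shown in this section: struct xdata_log, XDATA_PER_REPLY, xdata_set() and xdata_get() are hypothetical names, and the behavior once every value has been sent (return zero values) is an assumption.

```c
#define NX              32  /* same capacity as the plugin's xdata[]        */
#define XDATA_PER_REPLY 2   /* values per =xdata reply, as stated in text   */

/* Host-side stand-in for xdata[], nxdata and nxget. */
struct xdata_log {
    unsigned int xdata[NX]; /* extra data                  */
    unsigned int nxdata;    /* number of valid entries     */
    unsigned int nxget;     /* index of next entry to send */
};

/* Record one auxiliary value; values beyond NX are silently dropped. */
static void xdata_set(struct xdata_log *log, unsigned int x)
{
    if (log->nxdata < NX) {
        log->xdata[log->nxdata] = x;
        ++log->nxdata;
    }
}

/* Serve one request: copy up to XDATA_PER_REPLY unsent values into out[]
 * and advance nxget. Returns the number of values copied (assumption:
 * zero once everything has been sent). */
static unsigned int xdata_get(struct xdata_log *log,
                              unsigned int out[XDATA_PER_REPLY])
{
    unsigned int n = 0;

    while (n < XDATA_PER_REPLY && log->nxget < log->nxdata) {
        out[n] = log->xdata[log->nxget];
        ++n;
        ++log->nxget;
    }
    return n;
}
```

With a full log, 16 calls to xdata_get() each return two values and the 17th returns none, which matches the 16 requests mentioned above.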
The xdata[NX] array is also used to record other auxiliary data. For example, the same approach was used to locate a pointer bug associated with management of the free list.
Revised: Fri, Feb 13, 2009