Niakwa, Inc.

    

 



April 25, 2001 No. 99

Performance Tuning NPL in Windows Environments

As more and more users transition to Windows NT / 2000 networks from Novell and Unix, and to 32-bit applications from 16-bit applications, some of them observe performance anomalies.  This Tech Note is designed to clarify some configuration items that cause these performance issues, and to suggest possible solutions.


1          Issue: Oplocks on Windows NT / 2000 Networks

Systems that experience this issue have these things in common:

  • Files are located on an NT server with opportunistic locking enabled (not local and not Novell)
  • Significant performance difference between one vs. multiple users accessing the same file

 1.1       Cause

What is opportunistic locking?  Opportunistic locking (oplocks) is a special file locking mode that says, in effect, that if only one user has a file open in shared mode, that user can treat the file as if it were open exclusively.  This means the client can cache data locally and, once data has been read from the server, never has to ask the server whether the data has changed.  In addition, the client can read ahead to pre-fill the cache, delay writes to the file, and apply other optimizations that are safe only with exclusive access to a file.  For example, requests to lock and unlock the file are simply discarded.  This all works fine, and can result in excellent performance.

The problem occurs when a second user opens the file in shared mode.  This ‘breaks’ the oplock, and from then on neither user can cache data from the file locally, and requests to lock data in the file result in actual network traffic.  All these effects reduce performance dramatically.  Once an oplock is broken, it stays broken until everyone closes the file, so having the second user quit will not restore it.

To summarize, NT / 2000 allow very fast access to a file when only one user has it open, and noticeably slower access when multiple users have it open.  (This multiple-user access is the primary situation in which people notice a decline in performance when moving to NT / 2000 from Novell or Unix.)  This is true of any files, not just NPL-related files.
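The oplock life cycle described in this section can be pictured as a small state machine.  The following Python sketch is only a toy model of that description; the class and method names are hypothetical, and it does not call any Windows API.

```python
class SharedFile:
    """Toy model of one server file and its oplock state."""

    def __init__(self):
        self.open_count = 0   # clients that currently have the file open
        self.broken = False   # True once a second client has opened the file

    def open(self):
        self.open_count += 1
        if self.open_count > 1:
            self.broken = True        # a second opener breaks the oplock

    def close(self):
        if self.open_count == 0:
            raise ValueError("file is not open")
        self.open_count -= 1
        if self.open_count == 0:
            self.broken = False       # reset only after everyone has closed

    def can_cache_locally(self):
        return self.open_count >= 1 and not self.broken


f = SharedFile()
f.open()
single_user = f.can_cache_locally()        # first user: caching allowed
f.open()
two_users = f.can_cache_locally()          # second user breaks the oplock
f.close()
after_one_quits = f.can_cache_locally()    # still broken
f.close()
f.open()
after_all_closed = f.can_cache_locally()   # caching allowed again
```

The point of the model is the third step: once broken, the fast mode does not come back until the open count reaches zero.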

1.2       Possible Solutions

1.2.1        Change Libraries to Read-Only

When a file is tagged as read-only in NT / 2000, opportunistic locking is enabled even when multiple users are accessing the file.  Thus performance on read-only files is always good.

Many applications already segregate the program files (excellent candidates for read-only status) and data files into separate libraries (diskimages).

However, due to the nature of the NPL library (diskimage), especially in older applications, many developers have placed programs, data files, and configuration files together in a single library.  Before any libraries can be changed to read-only, it may therefore be necessary to move the files that can be treated as read-only into a separate library.
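Once the program files have been separated out, the library file can be tagged read-only from Windows Explorer or from a script.  A minimal Python sketch follows, assuming the library is an ordinary file on the server; the helper names and the `.BS2` scratch file are illustrative only.  On Windows, `os.chmod` can change only the read-only flag, which is all that is needed here.

```python
import os
import stat
import tempfile


def make_read_only(path):
    """Clear the write-permission bits of `path`, leaving the other bits alone."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


def is_read_only(path):
    """Report whether the owner write bit (the read-only flag on Windows) is cleared."""
    return not (os.stat(path).st_mode & stat.S_IWUSR)


def demo():
    """Create a scratch file, tag it read-only, and report the flag before and after."""
    fd, path = tempfile.mkstemp(suffix=".BS2")
    os.close(fd)
    try:
        before = is_read_only(path)
        make_read_only(path)
        after = is_read_only(path)
    finally:
        # Restore write access so the scratch file can be removed on Windows.
        os.chmod(path, stat.S_IREAD | stat.S_IWRITE)
        os.remove(path)
    return before, after
```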

1.2.2        Minimize Use of Libraries by Multiple Clients

One obvious way to allow NT / 2000 to use opportunistic locking is to not have multiple clients require the same libraries or files.  While this may sound difficult, here are a few methods to consider.

Do not Share Temporary Files

If multiple clients all use work areas in temporary files, try to segregate those files so that each client is using a file with a unique name, rather than sharing one large file.
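One way to give each client its own work file is to build the name from something unique to the workstation and session.  The Python sketch below is an assumption about how such a name might be formed; the prefix, extension, and function name are hypothetical, and any name-length limits of the target library are not modeled.

```python
import os
import socket


def work_file_name(prefix="WORK", host=None, session=None, ext=".TMP"):
    """Build a work-file name that is unique per workstation and per session."""
    if host is None:
        host = socket.gethostname()   # name of this workstation
    if session is None:
        session = os.getpid()         # process id distinguishes sessions on one machine
    return f"{prefix}_{host}_{session}{ext}"


name_a = work_file_name(host="PC01", session=1)
name_b = work_file_name(host="PC02", session=1)
```

Because no two clients open the same work file, each file keeps its oplock.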

Move Files to Local Drive

In some instances it is possible to move some of your libraries or other files to the client’s local drive.  This not only avoids the oplock issues, but may also avoid network traffic issues.  (Incidentally, this can also be done with NPL executable files.)  However, updating files that are spread across client machines may be considerably harder than updating files located on the server.

Implement NDM for Data Files

Since NDM accesses data through a data engine, such as Btrieve, Access, or SQL Server, opportunistic locking is not an issue.  The data engine handles locking.  Speed of the data engine itself may be an issue.

NOTE: Due to oplocks, performance testing with one client and oplocks enabled DOES NOT simulate multi-user network performance.  If only one client is being used for performance testing, oplocks should be disabled so that the performance is more realistic.  Instructions for disabling oplocks can be found in Tech Note 95 or 89, found at http://www.niakwa.com/support/


2          Issue: Multiple Sessions on One Machine

Systems that experience this issue have these things in common:

  • Everything runs fast except for file access
  • Significant performance slowdown when second (or more) session is opened
  • Slowdown when using multiple sessions is especially pronounced in 32-bit RunTime, but less so when using the 16-bit RunTime

2.1       Cause

If the program has any type of module that is performing a task continuously, system resources can be drained to the point of noticeably slowing the application.  This effect becomes more obvious if more than one instance of the application is executing on the same workstation.

Some background information may help to clear up this mystery.  The following is an extremely simplified example.  A program sends commands or “instructions” to the CPU to be executed.  Typically a program will pause while waiting for disk input or output or for a return code from some operation.  Under Windows 95 and up, the CPU will execute instructions from a 32-bit application until there is a pause, and then start executing instructions from another process.  A continuously running process never pauses, so it can greatly reduce the resources available to everything else.

What is a continuously running process?

One example of a continuously running process would be a main menu with a digital clock on it that displays the time.  The background process loops continuously, getting the time from the system, formatting it, and printing the hours, minutes, and seconds to the screen.  This type of program will use up all available system resources, which becomes obvious if two sessions are executed on the same workstation or some other Windows task is attempted.  The slowdown occurs because the code for the clock on the main menu updates the screen continuously and never pauses.

A continuously running process will use 100% of available CPU resources, even if it is in background, forcing other processes to fight for resources.

NOTE: A “polling keyin” is a continuously running process!

NOTE: Why didn’t this happen under the old 16-bit Windows RunTime?  In Windows, 16-bit applications execute for a limited period of time (a time slice), and then the processor pauses the application and executes another task for the next time slice.  That way all processes are handled on a rotating basis.

2.2       Possible Solutions

The digital clock display only changes once per second.  The simple solution is to build a one second pause into the timing loop to free the processor to handle other operations.  To do this, add:

:   SELECT P6
:   PRINT
:   SELECT P

to the code right after printing the time to the screen.  That will pause the execution of that loop for one second.
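The same idea, shown outside NPL: the Python sketch below is an analogy only, not NPL code, and the function and parameter names are hypothetical.  The loop displays the time and then sleeps, which hands the CPU back to the operating system between updates.

```python
import time


def run_clock(ticks, pause=1.0, sleep=time.sleep, show=print):
    """Display the time `ticks` times, pausing `pause` seconds after each update."""
    for _ in range(ticks):
        show(time.strftime("%H:%M:%S"))
        sleep(pause)   # without this call the loop would spin and use all available CPU
```

Calling `run_clock(5)` prints five timestamps about one second apart.  The `sleep` and `show` parameters exist only so the loop can be exercised without waiting or printing.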

What if I don’t have a digital clock in my application?

This was just an example of a program that would use up system resources and slow the application. 

The point is, look for code, modules, functions and procedures that execute continuously but may not be accomplishing much that is productive.  Change this code so that it no longer uses 100% of available resources.


3          Issue: Windows 98/ME Scheduling

Systems that experience this issue have these things in common:

  • Everything runs fast except for file access
  • Client is Windows 98/ME
  • File is located on a Novell server or on an NT server with oplocks disabled or broken
  • File locking is implied, not explicit ($OPEN)

3.1       Cause

Windows 98/ME is not as adept at handling scheduling as NT Workstation / 2000 Pro.  Specifically, processes are not prioritized as well.   When a process makes a request from a server, if it does not get an immediate response, but rather gets a delayed response, Windows 98/ME may not immediately switch back to the process when that response is received.

Example:

We have been able to replicate this problem using:

(1) Win98
(2) A file located on a Novell server, either 4.12 or 5.0 or on an NT server with oplocks disabled or broken.
(3) Novell Client for Win9x 3.3 or Microsoft Client for Novell 

The test program we used did 1000 reads on a file.  The program referred to as the time waster did not do any disk access at all.

Access to the file using the test program is reasonable unless another process is eating lots of CPU time.  (We used a program that wasted time in a continuous loop, running in RTIWIN32, but it could be any other program, NPL or not.)  In this example, you might expect the test program to slow down by a factor of 2 as CPU time is split between the test program and the time waster.  That's what happens if the OS is NT, or if the file is located on an NT server using oplocks.  But if the file is on a Novell server, or an NT server where oplocks are disabled or broken, the test program slows to a crawl.

Apparently the problem is in the time it takes to obtain the (implied) file lock for the duration of the individual disk access.  If the test program is modified to do a $OPEN before going into the loop, the time returns to a reasonable value.

Looking at the Novell monitor program, the test program is issuing 1000 lock and 1000 unlock calls.

The theory is that when the test program issues a lock (or unlock?) call, the process is suspended until the server replies, granting (or denying) the lock.  If no other process is running on the desktop, the test process resumes immediately when the reply is received.  But if other processes like the time waster are running (with the same priority as the test process), Win98 apparently decides that there is no big rush to switch back to the test process.  It does switch back eventually (within, say, 50 milliseconds), but the cumulative delay adds up.  In the test program, 1000 × 50 milliseconds comes to about 50 seconds of total elapsed time.
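The arithmetic behind that estimate can be written out as a short Python sketch.  The 50 millisecond figure is the illustrative value used above, not a measurement, and the second calculation assumes (hypothetically) that unlock calls are delayed as well as lock calls.

```python
def cumulative_delay_seconds(calls, delay_ms):
    """Total time lost when each of `calls` server replies is picked up `delay_ms` late."""
    return calls * delay_ms / 1000.0


locks_only = cumulative_delay_seconds(1000, 50)          # 50.0 seconds
locks_and_unlocks = cumulative_delay_seconds(2000, 50)   # 100.0 seconds
```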

Interestingly, since only one process had the test file open, we would have expected the use of opportunistic file locking to make it totally unnecessary to even issue the lock and unlock calls to the Novell server, but apparently neither the Novell Client nor Microsoft’s Netware client implements oplocks to a Novell server.

We also tried the test program accessing a file on an NT server, when another node had the test file open (thus breaking the oplock on the file).  This resulted in a similar slow total time to complete the test program under Win98.

3.2       Possible Solutions

One solution would be to use explicit locks ($OPEN).
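The difference in server traffic between implied and explicit locking can be illustrated with a small counting model.  This Python sketch is not NPL and does not talk to a server; the class and function names are hypothetical, and it models only the number of lock and unlock calls in the 1000-read test described above.

```python
class CountingServer:
    """Stand-in for a file server that only counts lock traffic."""

    def __init__(self):
        self.lock_calls = 0
        self.unlock_calls = 0

    def lock(self):
        self.lock_calls += 1

    def unlock(self):
        self.unlock_calls += 1


def read_with_implied_locks(server, reads):
    """Each read is wrapped in its own lock and unlock call."""
    for _ in range(reads):
        server.lock()
        server.unlock()


def read_with_explicit_lock(server, reads):
    """One lock before the loop (like $OPEN) and one unlock after it (like $CLOSE)."""
    server.lock()
    for _ in range(reads):
        pass   # the reads themselves generate no lock traffic
    server.unlock()


implied = CountingServer()
read_with_implied_locks(implied, 1000)
explicit = CountingServer()
read_with_explicit_lock(explicit, 1000)
```

In this model the implied case issues 1000 lock and 1000 unlock calls, matching the Novell monitor observation, while the explicit case issues one of each.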

Another solution would be to change to Windows NT or 2000 clients.

Alternatively, one could implement the solutions described in Section 1 (Issue: Oplocks on Windows NT / 2000 Networks), so that a higher percentage of file requests are answered immediately.

One additional comment:  The problem may not appear when using the 16-bit version of RTIWIN.  Why?  Because all 16-bit applications run as part of the same process.  When a lock query (or any other I/O operation) is issued to the server from a 16-bit application, all 16-bit programs are suspended until the server replies, and there is no opportunity to switch to another application.


4          Issue: Unnecessary $CLOSE Operations

4.1       Cause

Code that makes heavy use of the 'general' $CLOSE statement (without a file number) should be changed to close a specific file number where possible.  Otherwise ALL open files are closed, and some will need to be reopened unnecessarily.  One developer observed a 100-fold performance difference in his application between general and specific $CLOSE statements.

Also, the ‘general’ $CLOSE should almost never be used in a subroutine of a robust commercial application, since it potentially unlocks files that are totally unrelated to the task at hand.  In other words, it is imprecise and a potential source of bugs.  For example:

         $OPEN #3
         GOSUB MyRecordOperation
         ;do stuff assuming #3 is still locked
         $CLOSE #3
         STOP
 
         =MyRecordOperation
         $OPEN #4                         :;lock my file
         ;.... do I/O on #4
         $CLOSE #4                        :;correct - unlocks my file only
         ;$CLOSE                          :;incorrect - unlocks my file and (in this case) unlocks #3 also
         RETURN

After the MyRecordOperation subroutine returns, any main line program disk operations will invoke otherwise unnecessary implied file locking logic in the RunTime.

4.2       Possible Solution

Wherever possible replace the ‘general’ $CLOSE (without a file number) with specific $CLOSE (with a file number) statements.  This will minimize time-consuming file locking activity.
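The effect of the two forms can be modeled with a simple table of locked file numbers.  This Python sketch is an illustration only; the `LockTable` class is hypothetical and does not represent how the RunTime tracks locks internally.

```python
class LockTable:
    """Toy model of the set of file numbers a program currently has locked."""

    def __init__(self):
        self.locked = set()

    def open(self, number):
        self.locked.add(number)

    def close(self, number=None):
        if number is None:
            self.locked.clear()          # 'general' close: releases everything
        else:
            self.locked.discard(number)  # specific close: releases one file only


specific = LockTable()
specific.open(3)
specific.open(4)
specific.close(4)    # #3 is still locked for the main line

general = LockTable()
general.open(3)
general.open(4)
general.close()      # #3 is released too, so it must be locked again later
```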


5          Windows Terminal Server / Citrix Performance Considerations

5.1       Location of Files

Arrange for the files to be located on the machine where NPL is running.  Oplocks should not be an issue.

We set up a Windows Terminal Server and tried a sample program that read 130,000 sequential BA records located on that system.  Even with multiple sessions running on the WTS server, accessing a file local to the server did not produce an unexpected performance problem.

If possible, always arrange for the NPL files to be local on the same physical machine as the WTS sessions are running.  That will help to reduce some performance problems.

5.2       Reduce Drive Mapping

It's very important not to use unnecessary drive mappings or server names in the $DEVICE table used by the NPL program.  For example, if the data file is in:

C:\apps\Datafiles\NPL\Test.BS2

on the machine called Server, and a share has also been set up on the server:

Share C:\apps\Datafiles\NPL as NPL

and a drive has been mapped to it, e.g. R: = \\SERVER\NPL

then theoretically you could use any of the following names to access the file from the server or WTS Sessions:

C:\apps\Datafiles\NPL\Test.BS2

\\SERVER\NPL\Test.BS2

R:TEST.BS2

Only the first one will give good performance!
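A quick audit of the names in a $DEVICE table can flag entries that go through a share or a mapped drive.  The Python sketch below is a hypothetical helper, not part of NPL; it assumes the set of mapped drive letters is supplied by the caller, since that cannot be determined from the path text alone.

```python
import ntpath


def classify_device_path(path, mapped_drives=frozenset()):
    """Classify a Windows path as 'local', 'unc', 'mapped', or 'relative'."""
    drive, _rest = ntpath.splitdrive(path)
    if drive.startswith("\\\\"):
        return "unc"        # \\SERVER\share form goes through the network redirector
    if drive and drive[0].upper() in mapped_drives:
        return "mapped"     # drive letter known to be mapped to a share
    if drive:
        return "local"      # ordinary local drive letter
    return "relative"       # no drive information at all


kinds = [
    classify_device_path(r"C:\apps\Datafiles\NPL\Test.BS2", {"R"}),
    classify_device_path(r"\\SERVER\NPL\Test.BS2", {"R"}),
    classify_device_path("R:TEST.BS2", {"R"}),
]
```

Only entries classified as local correspond to the first, fast form in the list above.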


6          Other Performance Issues

We have previously run into a known performance issue with RTIWIN32 (compared to 16-bit versions) when many files are being opened, caused by a previous bug correction (BR1122).  The performance issue doesn't normally cause a problem unless many files are being opened (and very little I/O is being done to each).  The cumulative effect on many files can be significant.

As a general rule, using the file/application server as a client, and thereby running a user application on it, tremendously wastes server resources and is considered bad networking practice.  Doing this results in severe performance slowdown for the clients.

Related to the above point, screen savers which require any server resources should not be used.  Almost any screen saver running on the server itself would use server resources, unless it simply makes the monitor go blank or turn off.

 



Niakwa, Basic-2C and NPL are trademarks of Niakwa, Inc. All other products mentioned are registered trademarks or trademarks of their respective companies.

Questions or problems regarding this web site should be directed to webmaster@niakwa.com.
Copyright 1996-2010 Niakwa, Inc. All rights reserved.
Last updated: Thursday January 07, 2010.