Guides/DLLs/Calling DLLs

Calling Procedures in DLLs

Calling procedures incorrectly can crash your system or corrupt memory.

J can call procedures in a shared library file. On Windows these files are called DLLs (dynamic link libraries) and have a .dll extension. On Linux they are called shared libraries or shared objects and have a .so extension. On macOS (OS X) they are called dynamic libraries and have a .dylib extension. Here, the term DLL is used for all platforms. A procedure in a DLL is called by its name and filename.

J can also call a procedure in memory directly by its address, or a procedure in an object by the object address and the procedure index.

Calling procedures from J is the same across platforms.

Recommended reading, with more information and examples: https://www.jsoftware.com/help/jforc/calling_external_programs.htm

Verb cd

Verb cd calls a procedure. The form is:

'filename procedure [>][+][%] declaration' cd parameters

filename

The filename is usually the name of a DLL. The search path for finding a filename that is not fully qualified involves many directories and is different on each platform. Except for system DLLs, a fully qualified filename is recommended.

A filename of 0 indicates that the procedure is a memory address (J display of a J signed integer with _ as the negative sign).

A filename of 1 indicates that the procedure is a non-negative integer that is the index of a procedure in a vtable (list of procedure addresses). The first parameter is the object address and is the address of the address of the vtable. The declaration of the first parameter must be * or x .

procedure

The procedure is usually the case-sensitive name of the procedure in the DLL to call. A Windows procedure can be identified by a number specified as # followed by digits. Win32 API procedures that take string parameters are documented under a name, but are implemented under the name with an A suffix for 8 bit characters and a W suffix for 16 bit characters. For example, CreateFile is documented, but the procedures you call are CreateFileA or CreateFileW. If filename is 0 or 1, the procedure is treated as discussed in the filename section.

A procedure returns a scalar result and takes 0 or more parameters. Parameters are passed by value or by a pointer to values. Pointer parameters can be read and set.

> option returns a scalar result, where possible. Without > the result is the boxed scalar result followed by the boxed arguments.

+ option selects the alternate calling convention. The calling convention is the rules for how the result and parameters are passed between the caller and the procedure. Using the wrong one can crash or corrupt memory. On Windows, the standard is __stdcall and + selects __cdecl. On Unix + currently has no effect, but should be avoided as this may change in future releases.

% option does an fpreset (floating-point state reset) after the call. Some procedures leave floating-point in an invalid state that causes a crash at some later time.
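
As a minimal sketch of the > option, the lines below call the C library procedure strlen with and without it. The filename libc.so.6 is an assumption (glibc on Linux; other platforms use a different filename), and an explicit NUL terminator is appended to the text to be safe.

```j
NB. Assumption: glibc on Linux, where the C library is libc.so.6.
s=: 'hello',{.a.                  NB. text plus an explicit NUL terminator

NB. Without > : boxed scalar result followed by the boxed arguments.
r=: 'libc.so.6 strlen x *c' cd ,<s
len=: 0 {:: r                     NB. procedure result: 5

NB. With > : just the scalar result.
len2=: 'libc.so.6 strlen > x *c' cd ,<s    NB. 5
```

The ,< builds a one-element list of boxed parameters. The x result type matches the pointer-sized size_t that strlen returns.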

declaration

The declaration is a set of blank delimited codes describing result and parameter types:

code    parameter type
b       J literal1 (1 byte) introduced in j901
c       J character (1 byte)
w       J literal2 (2 byte)
u       J literal4 (4 byte)
s       short integer (2 byte)
i       integer (4 byte)
l       long integer (8 byte)
x       J integer (4 byte or 8 byte depending on 32/64)
f       short float (4 byte)
d       J float (8 byte)
j or z  J complex (16 byte - 2 d values) (pointer only, not as result or scalar)
*       pointer to memory
&       pointer to constant memory
n       no result (result, if any, is ignored and 0 is returned)

The first declaration type describes the result and the remaining ones describe the parameters in the cd right argument. Arguments are converted as required to conform with their declaration type.

The c w u x d j types are native J types. In J32 (32 bit) the i type is the same as x and the l type is an error. In J64 (64 bit) the l type is the same as x and the i type is handled similarly to s.

A scalar type (c w u s i l x f d) must have a scalar parameter of the right type. Scalar s i l values take a J integer and f values take a J float.

The * type is a pointer to values. A * can be followed by c w u s i l x f d or j to indicate the type of values. The DLL can read or write this memory. The memory is unaliased.

The & type is the same as * except that the memory is not unaliased, because it is assumed to be unmodified by the DLL.

A pointer type parameter must be an array with rank > 0 (for example a list) of the right type, or a scalar box containing a scalar integer that is a memory address.

A J integer list used for *s is converted in place to shorts before calling the DLL and is converted back after the call. A J integer list used for *i on J64 is converted in place to 4 byte integers and then back. A J float list used for *f is converted in place to short floats and then back.

A *s and *f also allow a character list with a count that is evenly divisible respectively by 2 and 4.

J does not have unsigned types so DLL unsigned type values with the top bit on are returned to J as negative values.
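
One way to recover the unsigned value of a 4 byte result is modular arithmetic. The sketch below assumes the value came from an i declaration. The name u32 is hypothetical and is not part of the J library.

```j
NB. u32 is a hypothetical helper, not part of the J library.
NB. It maps a signed 4 byte value to the unsigned value with the same bits.
u32=: 4294967296&|

a=: u32 _1             NB. 4294967295 (all 32 bits set)
b=: u32 _2147483648    NB. 2147483648 (only the top bit set)
c=: u32 5              NB. 5 (non-negative values are unchanged)
```

The same idea does not carry over directly to 8 byte results on J64, because the unsigned value may not fit in a J integer.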

The mema result (Memory Management) can be used as a * type parameter. A memory address parameter is a boxed scalar. The NULL pointer is <0.
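
A minimal sketch of passing a mema address as a pointer parameter follows. It again assumes glibc on Linux (libc.so.6) for strlen, and uses the mema, memw, memr and memf verbs described under Memory Management.

```j
NB. Assumption: glibc on Linux (libc.so.6).
p=: mema 16                       NB. allocate 16 bytes; p is an integer address
('hello',{.a.) memw p,0,6,2       NB. write 6 chars (type 2) at byte offset 0

NB. The parameter is the boxed address <p, inside a one-element list.
n=: 'libc.so.6 strlen > x *' cd ,<<p

t=: memr p,0,n,2                  NB. read n chars back: hello
memf p                            NB. free the memory when finished
```

Memory from mema is not managed by J, so each mema should be paired with a memf.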

The cd right argument is a list of enclosed parameters. If the argument is not boxed, it is treated as if <"0 were applied.
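
As a sketch of the unboxed form, the calls below pass two floats to the C math library procedure pow. The filename libm.so.6 is an assumption (glibc on Linux).

```j
NB. Assumption: glibc on Linux, where the math library is libm.so.6.
NB. The unboxed list 2.5 2 is treated as if <"0 were applied,
NB. so both calls pass the same two boxed floats.
a=: 'libm.so.6 pow > d d d' cd 2.5 2          NB. 6.25
b=: 'libm.so.6 pow > d d d' cd <"0 ] 2.5 2    NB. same call, boxed explicitly
```

The unboxed form only suits scalar parameters of one type. Pointer parameters, or a mix of types, need an explicitly boxed list.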

cd is rank 1. Its result, without the > option, is the procedure result catenated with its possibly modified right argument. With the > option the result is the procedure scalar result.

Examples

For example, the Win32 API procedure GetProfileString in kernel32 gets the value of the windows/device keyword.

   a=: 'kernel32 GetProfileStringA s *c *c *c *c s'
   b=: 'windows';'device'; 'default'; (32$'z');32
   a cd b
┌──┬───────┬──────┬───────┬────────────────────────────────┬──┐
│31│windows│device│default│Microsoft Print to PDF,winspool │32│
└──┴───────┴──────┴───────┴────────────────────────────────┴──┘

The first type s indicates that the procedure returns a short integer. The first pointer names a section. The second pointer names a keyword. The third pointer is the default if the keyword is not found. The fourth parameter is where the keyword text is put. The fifth parameter is the length of the fourth parameter.

If the GetProfileStringA declaration were wrong, say a d result instead of s, it could crash your system. If the fifth parameter were 64 and the keyword value were longer than the 32 characters allocated by the fourth parameter, the extra data would overwrite memory.

Procedures are usually documented with a C prototype. For example, the prototype for GetProfileString is:

DWORD WINAPI GetProfileString(
  _In_  LPCTSTR lpAppName,
  _In_  LPCTSTR lpKeyName,
  _In_  LPCTSTR lpDefault,
  _Out_ LPTSTR  lpReturnedString,
  _In_  DWORD   nSize
);

J declaration types and some commonly used names are:

J type  other names
c       char, byte, bool
s       short, short int, word, %
i       int, dword
f       float, !
d       double, #
*c      char*, int*, LP..., void*, $
n       void
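
Reading a prototype against this table gives a declaration one type at a time. The sketch below is one possible mapping for the GetProfileString prototype above. It is Windows only, and the use of & for the three _In_ strings assumes a J version that supports the & type.

```j
NB. DWORD   result            -> i
NB. LPCTSTR lpAppName         -> &c  (_In_: constant string)
NB. LPCTSTR lpKeyName         -> &c
NB. LPCTSTR lpDefault         -> &c
NB. LPTSTR  lpReturnedString  -> *c  (_Out_: written by the procedure)
NB. DWORD   nSize             -> i
decl=: 'kernel32 GetProfileStringA i &c &c &c *c i'

NB. Windows only: the result is the procedure result followed by the arguments.
r=: decl cd 'windows';'device';'default';(32$'z');32
```

With this declaration the returned text is item 4 of r (4 {:: r), since item 0 is the procedure result.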

cdf'' unloads all DLLs that cd has loaded. A loaded DLL is in use and attempts to modify it will fail. If you are developing and testing a DLL you must run cdf'' before you can build and save a new version.

J807 Incompatibilities

JE (J engine) is getting smarter and some changes conflict with old cd assumptions. JE is more aggressive about not copying data and instead having multiple users of the same memory. A noun can even use memory in the middle of another noun! This allows better performance in a smaller memory footprint, sometimes significantly so.

Previous memory management meant that cd changes generally worked as expected, especially in simpler cases. In J807 a cd that changes J memory avoids 'coherence management' and some code changes may be required.

A cd now ensures that memory that can be changed is 'unaliased' so that changes only affect the cd result. The unalias is a forced copy if required and is done with memu (15!:15).

This avoids some old strange behavior such as:

  a=. b=. i.5
  ... cd ...;a;... NB. which would have changed both a and b.

Critically it stops nasty problems where changing an arg could have changed more nouns than expected because of shared memory.

A coding trick that worked will no longer work:

  ... cd ...;(a=.,_1);...
  ... a ... NB. a is still ,_1 and does not contain the changes!

The arg is unaliased so only the cd result is changed:

  r=. ... cd ...;(,_1);...
  a=. >n{r NB. changes

In some cases the arg is known to be constant and an unaliased copy is unnecessary overhead. A declaration with & instead of * indicates a constant and cd skips the unalias.

Summary of coding changes required:

  • get changes from the cd result
  • & instead of * avoids unnecessary copy of constant
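
Both points can be sketched with the C library procedure strcpy, which writes to its first parameter and only reads its second. The filename libc.so.6 (glibc on Linux) and J807 or later are assumptions.

```j
NB. Assumptions: glibc on Linux (libc.so.6) and J807 or later.
buf=: 16$' '                      NB. destination buffer of 16 blanks
src=: 'hello',{.a.                NB. source text with explicit NUL terminator

NB. *c : strcpy writes to the destination, so cd passes an unaliased copy.
NB. &c : the source is constant, so cd can skip the copy.
r=: 'libc.so.6 strcpy x *c &c' cd buf;src

new=: 1 {:: r                     NB. the modified destination is in the cd result
ok1=: 'hello' -: 5 {. new         NB. 1 : the copied text arrived
ok2=: buf -: 16$' '               NB. 1 : the noun buf itself is unchanged
```

Item 0 of r is the strcpy result (an address), so the changed buffer is item 1.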