Jd performance is due to columnar data. A data column (e.g., name, date, or license) is a disk file that is mapped to a J noun. A query on a data column needs only that column's data in ram to run. If the query columns fit in ram, queries run at ram speed. Only the data required for the result is read from disk, and typically this is a small fraction of the total database size.
Ram is the most critical factor in Jd performance.
In general, performance will be good if available ram is at least 2 times the space required by the columns typically used in a query. A factor of 4 is better if more complicated operations such as ref and key are used.
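A rough sizing sketch in J (the row count and column widths here are made-up assumptions, not a rule):

   NB. assume 1e9 rows and two 4-byte integer query cols
   1e9 * 4           NB. bytes for one col: 4e9
   2 * 1e9 * 4       NB. bytes for both query cols: 8e9
   2 * 2 * 1e9 * 4   NB. times the factor of 2: 1.6e10 bytes of ram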
Working with data in ram is orders of magnitude faster than working with data that has to come from an ssd. For a serious application there is no good reason not to use an ssd.
Working with data from an ssd is orders of magnitude faster than from a hard disk drive.
Use the smallest intx type that will hold the data. This will reduce overall database size and will make better use of ram.
A partitioned table has column data in multiple files. For the user, a partitioned table is the same as any other table, but it can make a significant performance difference. High performance queries/inserts/updates/modifies can be achieved on tables with billions of rows on modest hardware if the tables are partitioned.
allocation across drives
Table column data files can be located, under your control, on different drives. For example, table columns critical for query performance could be on an ssd and the rest of the columns could be stored on normal drives.
This control over drive allocation also works for partitioned tables. For example, column data files for recent dates could be stored on ssd and files for the rest of the data could be stored on normal drives.
A database structure maps directly to a folder structure. A database is a folder, a table is a folder in a database, and a column is a folder in a table.
A db/table/col drop cannot be undone. It would be unfortunate to inadvertently drop something that was hard to recover.
Restrictions while building a database can be a nuisance, but when things are stable it can be nice to disallow drops. This can be done with jdaccess but that is perhaps more mechanism than warranted.
jddropstop provides an easy way to prevent bad drops.
A db/table/col drop uses the utility jddeletefolder. This utility is also used in other admin activities, for example, deleting folders of csv files that have been processed.
A jddeletefolder cannot be undone. It would be unfortunate to inadvertently delete something that was hard to recover.
jddeletefolder allows delete only if certain criteria are met and this can prevent an unintended delete.
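The guard idea can be sketched in shell (the marker-file criterion here is hypothetical, for illustration only; it is not jddeletefolder's actual rule):

```shell
# guarded delete sketch: refuse unless the folder carries a marker file
# ("deletable" is a hypothetical criterion, not jddeletefolder's real check)
guarded_delete() {
  d="$1"
  if [ -f "$d/deletable" ]; then
    rm -rf "$d" && echo "deleted $d"
  else
    echo "refused: no deletable marker in $d" >&2
    return 1
  fi
}
```

An unintended target simply fails the criterion and survives the call.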
Jd4 is incompatible with databases created under previous versions. Jd4 code will not open a Jd3 database and vice versa.
To continue working with databases created with Jd3 you must:
- rename ~addons/data/jd to ~addons/data/jd3
And then, when you want to work with a Jd3 database:
An update will overwrite ~addons/data/jd with the Jd4 codebase. After that point, update can't give you a jd3 folder. You can create one by downloading the appropriate file from jd3 zips, unpacking it in a temp folder, renaming jd to jd3, and moving it to your addons folder.
See Release for change details.
Jd4 has conversion tools to migrate a Jd3 database to Jd4. The conversion is done in place and works only in one direction.
start J

   load 'jd'
   jd 'list version'                          NB. verify 4.1
   load '~addons/data/jd/tools/convert.ijs'
   dryrun '~temp/jd/test'                     NB. dry run this db and report details

DO NOT conrun UNTIL dryrun IS CLEAN! BACKUP!

   conrun '~temp/jd/test'                     NB. convert this db in place and report details
- jdactive col dropped and deleted rows compressed out
- jd... cols (ref/reference/hash/...) dropped
- cols with type time or enum dropped
- jd'ref ...' done for jdref and jdreference cols
- dryrun failure leaves db untouched
- conrun failure probably leaves db damaged - neither jd3 nor jd4 compatible
Complete backup or restore is just a copy of the db file folder. Host shell scripts can provide full backup/restore. With large databases and suitable hardware it might be worthwhile to use multiple tasks and compression.
CSV dump/restore also provides complete backup.
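A minimal shell sketch of folder-copy backup and restore (db/mydb is a hypothetical path; close the database in Jd before archiving):

```shell
# the whole db is just a folder tree, so archive it
# (db/mydb is a stand-in path; a real db would already exist)
mkdir -p db/mydb/t/c && echo 1 > db/mydb/t/c/data.dat   # stand-in db
tar -czf mydb-backup.tar.gz -C db mydb                  # compressed backup
# restore: unpack into a (possibly different) parent folder
mkdir -p restored && tar -xzf mydb-backup.tar.gz -C restored
```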
Jd is distributed with JAL (Package Manager). The Jd library is at ~addons/data/jd and is accessed with the following equivalent lines:
A developer works with a local repo. Use the development library with something like:
Loading jd.ijs sets JDP_z_ as the path to the Jd library and this is used for all library references.
An automated process copies the developer repo to the addon svn repo to build a new Jd release.
Jd requires lots of file handles. Using thousands of columns requires thousands of handles.
Jd fails badly if it runs out of handles. When a file cannot be accessed an error is signaled, perhaps in the middle of an operation, and this can leave the database damaged.
A Windows user does not have a limit on file handles.
A Linux/Mac user often has low soft and hard limits on handles, and these must be increased for serious use of Jd. There is no reason not to raise the limit to 100000.
See the soft and hard limits with:
...$ ulimit -n
If the hard limit is high enough, it might be easiest, before starting J, to do:
...$ ulimit -n 100000
To increase the file handle limit for Linux Jd user fred:

...$ ulimit -n   # show current file handle limit

- run a superuser text editor and open /etc/security/limits.conf
- add the following 2 lines at the end:
  fred soft nofile 200000
  fred hard nofile 200000
- save the file, restart the system, and verify the new ulimit
To increase file handle limit for Mac the steps are similar, but of course different, and details are left to the reader. Yosemite has a low soft limit and a high hard limit.
Folder symbolic links (Windows folder junctions) are used to place db cols on different drives.
- i/o load balancing across drives
- some cols can be placed on ssd drives
- total db size irrelevant - only limit is biggest col must fit on a drive
See Admin jdlinkmove, jdlinkset, and jdlinktargets for details.
See tutorial link.
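A shell sketch of the underlying mechanism (ssd/ and db/ are made-up stand-in paths; in Jd the jdlink* verbs manage this for you):

```shell
# place a col folder on another drive and link it into the db tree
# (ssd/ and db/ are hypothetical stand-ins for real mount points)
mkdir -p ssd/f_price db/f
ln -s "$PWD/ssd/f_price" db/f/price   # db sees the col at its usual place
echo 42 > db/f/price/data.dat         # writes land on the other drive
```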
The Jd linux libjd.so shared library will run on most modern linux systems.
If Jd gets an error loading the linux shared library, please report the following to the J database forum:
...$ cat /proc/version
...$ cat /etc/issue
...$ ldd .../libjd.so
locales and db file structure
Parts of a database (tables, cols, data) correspond directly with the file structure. That is, a table is a folder in the database, each col is a folder in its table folder, and data is a file in its col folder.
When a database is opened, J locales are created that correspond to the database structure. Each table has a locale with metadata and each col has a locale with metadata and mapped file(s) with the data.
Sometimes it can be useful to dig into the internals.
   jdadminx'test'
   jd'gen test f 3'
   jd'reads from f'
   t=. jdgl_jd_'f'     NB. get locale for table f
   NAME__t             NB. table name
   NAMES__t            NB. col names in table
   c=. jdgl_jd_'f x'   NB. get locale for col x in table f
   typ__c              NB. column type
   PATH__c             NB. path to col dat file
   dat__c              NB. mapped file data
Folder pm has scripts for performance measurement. See pmhelp_jd_ for info.
Windows search service
Windows Search Service (content indexing, ...) can cause lots of disk activity and can interfere with Jd file operations. If possible, disable it when using Jd.
Disable Windows Search Service as follows:
- command prompt ...>services.msc
- scroll down and right click Windows Search
- click Properties
- click Stop button to stop service if it is running
- change Startup type: to Disabled
- click Apply
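The same can be done from an elevated command prompt (WSearch is the service name of Windows Search):

...>net stop WSearch
...>sc config WSearch start= disabled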