

PGN was devised around 1993 by Steven J. Edwards, and was first popularized and specified via a Usenet newsgroup. PGN is structured "for easy reading and writing by human users and for easy parsing and generation by computer programs." The chess moves themselves are given in algebraic chess notation using English initials for the pieces. There are two formats in the PGN specification, the "import" format and the "export" format. The import format describes data that may have been prepared by hand, and is intentionally lax; a program that can read PGN data should be able to handle the somewhat lax import format.

You can perhaps use the Linux split command to split the file into as many files as you want. In its simplest usage, run in a terminal, it can split a file into 1000 parts. Do this after extracting the archive. But I think zip itself allows a split option, so any other modern zip/unzip tool would have the option too. Check "man split" to see what other options split offers. It is also possible to specify the number of bytes in each file. I hope you won't face memory issues in using this tool.

The split may happen in the middle of a game. Maybe you can fix this issue manually after the split. For example, you can check the tail of each file, say 37.pgn, to see if it has been split in the middle of a game. If so, look at the head of the next file, 38.pgn, take the remaining part of the game from there, add it at the end of 37.pgn, and remove it from the beginning of 38.pgn. If you manage to run the split command, then the rest will be easy with small scripts (using e.g. bash, sed, awk, perl or python). But instead of this, you can simply discard the broken games: that costs at most 999 games if you have split the file into 1000 parts.

After doing all this, you can use pgn-extract to sort the contents of all these files by ECO code. "pgn-extract -E3 *.pgn" will create 500 files, A00.pgn to E99.pgn. You can also apply other filters, like a minimum rating. For example, create a 'tagfile' with the rating conditions in it, and you will get 500 files by ECO code, each containing only games in which both players have Elo at least 2400. This is a very powerful tool.
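
The manual repair of split files (moving the leftover part of a game from the head of one file to the tail of the previous one) could probably be scripted along the following lines. This is only a rough sketch, not a tested tool: the names `repair_chunks` and `GAME_START` are made up for illustration, and it assumes that every game begins with an `[Event "..."]` tag at the start of a line, which is true for export-format PGN but may not hold for hand-written files.

```python
import re

# Assumption: each game starts with an [Event ...] tag at the beginning of a line.
GAME_START = re.compile(r'^\[Event ', re.MULTILINE)


def repair_chunks(chunks):
    """Return new chunk texts in which every non-empty chunk starts a game.

    Text found before the first game start of a chunk is the remainder of a
    game that began in an earlier chunk, so it is appended to that chunk.
    """
    repaired = []
    open_idx = None  # index of the chunk holding the most recent game
    for text in chunks:
        match = GAME_START.search(text)
        cut = match.start() if match else len(text)
        if open_idx is not None:
            # Finish the game that was cut off in an earlier chunk.
            repaired[open_idx] += text[:cut]
            text = text[cut:]
        repaired.append(text)
        if text:
            open_idx = len(repaired) - 1
    return repaired
```

To use it on real files, you would read the pieces produced by split into a list of strings in file order, pass the list to `repair_chunks`, and write each returned string back to its file. A piece that contained no game start at all comes back empty, because its whole content belonged to a game in an earlier piece.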
