annotate design.txt @ 709:497e697636fc
Report merged lines as changed block if possible, not as a sequence of added/deleted blocks. To facilitate access to merge parent lines AddBlock got mergeLineAt() method that reports index of the line in the second parent (if any), while insertedAt() has been changed to report index in the first parent always
| author | Artem Tikhomirov <tikhomirov.artem@gmail.com> | 
|---|---|
| date | Wed, 21 Aug 2013 16:23:27 +0200 | 
| parents | 31a89587eb04 | 
| children | 
| rev | line source | 
|---|---|
| 1 
a3576694a4d1
Repository detection from local/specified directory
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: diff
changeset | 1 FileStructureWalker (pass HgFile, HgFolder to callable; which can ask for VCS data from any file) | 
| 
a3576694a4d1
Repository detection from local/specified directory
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: diff
changeset | 2 External uses: user browses files, selects one and asks for its history | 
| 
a3576694a4d1
Repository detection from local/specified directory
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: diff
changeset | 3 Params: tip/revision; | 
| 
a3576694a4d1
Repository detection from local/specified directory
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: diff
changeset | 4 Implementation: manifest | 
| 
a3576694a4d1
Repository detection from local/specified directory
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: diff
changeset | 5 | 
| 
a3576694a4d1
Repository detection from local/specified directory
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: diff
changeset | 6 Log --rev | 
| 
a3576694a4d1
Repository detection from local/specified directory
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: diff
changeset | 7 Log <file> | 
| 2 
08db726a0fb7
Shaping out low-level Hg structures
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
1diff
changeset | 8 HgDataFile.history() or Changelog.history(file)? | 
| 
08db726a0fb7
Shaping out low-level Hg structures
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
1diff
changeset | 9 | 
| 
08db726a0fb7
Shaping out low-level Hg structures
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
1diff
changeset | 10 | 
| 
08db726a0fb7
Shaping out low-level Hg structures
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
1diff
changeset | 11 Changelog.all() to return list with placeholder, not-parsed elements (i.e. read only compressedLen field and skip to next record), so that | 
| 
08db726a0fb7
Shaping out low-level Hg structures
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
1diff
changeset | 12 total number of elements in the list is correct | 
| 1 
a3576694a4d1
Repository detection from local/specified directory
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: diff
changeset | 13 | 
| 
a3576694a4d1
Repository detection from local/specified directory
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: diff
changeset | 14 hg cat | 
| 2 
08db726a0fb7
Shaping out low-level Hg structures
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
1diff
changeset | 15 Implementation: logic to find file by name in the repository is the same with Log and other commands | 
| 
08db726a0fb7
Shaping out low-level Hg structures
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
1diff
changeset | 16 | 
| 
08db726a0fb7
Shaping out low-level Hg structures
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
1diff
changeset | 17 | 
| 
08db726a0fb7
Shaping out low-level Hg structures
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
1diff
changeset | 18 Revlog | 
| 4 
aa1912c70b36
Fix offset issue for inline revlogs. Commandline processing.
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
2diff
changeset | 19 What happens when a big entry is added to a file: when does it detect that it can no longer fit into .i and needs .d? Do the inline flag and .i format change? | 
| 
aa1912c70b36
Fix offset issue for inline revlogs. Commandline processing.
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
2diff
changeset | 20 | 
| 22 
603806cd2dc6
Status of local working dir against non-tip base revision
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
21diff
changeset | 21 What's the natural hg way to see nodeids of specific files (i.e. when I do 'hg --debug manifest -r 11' and see nodeid of some file, and | 
| 
603806cd2dc6
Status of local working dir against non-tip base revision
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
21diff
changeset | 22 then would like to see what changeset this file came from)? | 
| 4 
aa1912c70b36
Fix offset issue for inline revlogs. Commandline processing.
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
2diff
changeset | 23 | 
| 
aa1912c70b36
Fix offset issue for inline revlogs. Commandline processing.
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
2diff
changeset | 24 ---------- | 
| 6 
5abe5af181bd
Ant script to build commands and run sample
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
5diff
changeset | 25 + support patch from baseRev + few deltas (although done in a way patches are applied one by one instead of accumulated) | 
| 4 
aa1912c70b36
Fix offset issue for inline revlogs. Commandline processing.
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
2diff
changeset | 26 + command-line samples (-R, filenames) (Log & Cat) to show on any repo | 
| 6 
5abe5af181bd
Ant script to build commands and run sample
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
5diff
changeset | 27 +buildfile + run samples | 
| 9 
d6d2a630f4a6
Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
6diff
changeset | 28 *input stream impl + lifecycle. Step forward with FileChannel and ByteBuffer, although a questionable accomplishment (looks a bit complicated, cumbersome) | 
| 14 
442dc6ee647b
Show correct time
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
11diff
changeset | 29 + dirstate.mtime | 
| 43 
1b26247d7367
Calculate result length of the patch operarion, when unknown
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
41diff
changeset | 30 +calculate sha1 digest for file to see I can deal with nodeid. +Do this correctly (smaller nodeid - first) | 
| 18 
02ee376bee79
status operation against current working directory
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
17diff
changeset | 31 *.hgignored processing | 
| 25 
da8ccbfae64d
Reflect Nodeid's array is exactly 20
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
22diff
changeset | 32 +Nodeid to keep 20 bytes always, Revlog.Inspector to get nodeid array of meaningful data exact size (neither leading 00 bytes, nor 12 extra bytes from the spec) | 
| 26 
71a9ba42cee8
Memory-mapped files for bigger files. Defect reading number of bytes greater than size of the buffer fixed
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
25diff
changeset | 33 +DataAccess - implement memory mapped files, | 
| 49 
26e3eeaa3962
branch and user filtering for log operation
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
43diff
changeset | 34 +Changeset to get index (local revision number) | 
| 60 
613c936d74e4
Log operation to output mode detailed (added, removed) files
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
55diff
changeset | 35 +RevisionWalker (on manifest) and WorkingCopyWalker (io.File) talking to ? and/or dirstate (StatusCollector and WCSC) | 
| 
613c936d74e4
Log operation to output mode detailed (added, removed) files
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
55diff
changeset | 36 +RevlogStream - Inflater. Perhaps, InflaterStream instead? branch:wrap-data-access | 
| 
613c936d74e4
Log operation to output mode detailed (added, removed) files
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
55diff
changeset | 37 +repo.status - use same collector class twice, difference as external code. add external walker that keeps collected maps and use it in Log operation to give files+,files- | 
| 78 
c25c5c348d1b
Skip metadata in the beginning of a file content. Parse metadata, recognize copies/renames
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
64diff
changeset | 38 + strip \1\n metadata out from RevlogStream | 
| 84 
08754fce5778
updated design questions
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
78diff
changeset | 39 + hash/digest long names for fncache | 
| 169 
8c8e3f372fa1
Towards initial clone: refactor HgBundle to provide slightly higher-level structure of the bundle
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
136diff
changeset | 40 +Strip off metadata from the beginning of the stream - DataAccess (with rebase/moveBaseOffset(int)) would be handy | 
| 
8c8e3f372fa1
Towards initial clone: refactor HgBundle to provide slightly higher-level structure of the bundle
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
136diff
changeset | 41 + hg status, compare revision and local file with kw expansion and eol extension | 
| 
8c8e3f372fa1
Towards initial clone: refactor HgBundle to provide slightly higher-level structure of the bundle
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
136diff
changeset | 42 | 
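The item above about calculating a sha1 digest "correctly (smaller nodeid - first)" can be illustrated with a minimal sketch. It assumes the usual Mercurial rule that a nodeid is the SHA-1 of the two parent nodeids, smaller one first, followed by the revision content. `NodeidSketch` and its methods are hypothetical names for illustration, not part of the hg4j API.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not the hg4j API.
final class NodeidSketch {
    static final int SIZE = 20; // SHA-1 digest length in bytes

    // Digest of (smaller parent, larger parent, content).
    static byte[] nodeid(byte[] p1, byte[] p2, byte[] content) {
        if (p1.length != SIZE || p2.length != SIZE) {
            throw new IllegalArgumentException("parent nodeids must be 20 bytes");
        }
        // Hash the smaller parent first, so the result does not depend on parent order.
        boolean p1First = compareUnsigned(p1, p2) <= 0;
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            sha1.update(p1First ? p1 : p2);
            sha1.update(p1First ? p2 : p1);
            return sha1.digest(content);
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException("SHA-1 is not available", ex);
        }
    }

    // Lexicographic comparison of two 20-byte arrays, bytes treated as unsigned.
    static int compareUnsigned(byte[] a, byte[] b) {
        for (int i = 0; i < SIZE; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return 0;
    }
}
```

Because the parents are ordered before hashing, `nodeid(p1, p2, text)` and `nodeid(p2, p1, text)` give the same 20 bytes.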
| 
8c8e3f372fa1
Towards initial clone: refactor HgBundle to provide slightly higher-level structure of the bundle
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
136diff
changeset | 43 write code to convert inlined revlog to .i and .d | 
| 60 
613c936d74e4
Log operation to output mode detailed (added, removed) files
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
55diff
changeset | 44 | 
| 
613c936d74e4
Log operation to output mode detailed (added, removed) files
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
55diff
changeset | 45 delta merge | 
| 
613c936d74e4
Log operation to output mode detailed (added, removed) files
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
55diff
changeset | 46 DataAccess - collect debug info (buffer misses, file size/total read operations) to find out a better strategy for buffer size detection. Compare performance. | 
| 396 
0ae53c32ecef
Straighten out exceptions thrown when file access failed - three is too much
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
367diff
changeset | 47 RevlogStream - inflater buffer (and other buffers) size may be too small for repositories out there (i.e. inflater buffer of 512 bytes for 200k revision) | 
| 41 
858d1b2458cb
Check integrity for bundle changelog. Sort nodeids when calculating hash
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
33diff
changeset | 48 | 
| 169 
8c8e3f372fa1
Towards initial clone: refactor HgBundle to provide slightly higher-level structure of the bundle
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
136diff
changeset | 49 | 
| 128 
44b97930570c
Introduced ChangelogHelper to look up changesets files were modified in
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
93diff
changeset | 50 Parameterize StatusCollector to produce copy only when needed. And HgDataFile.metadata perhaps should be moved to cacheable place? | 
| 
44b97930570c
Introduced ChangelogHelper to look up changesets files were modified in
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
93diff
changeset | 51 | 
| 18 
02ee376bee79
status operation against current working directory
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
17diff
changeset | 52 Status operation from GUI - guess, usually on a file/subfolder, hence API should allow for starting path (unlike cmdline, seems useless to implement include/exclude patterns - GUI users hardly enter them, ever) | 
| 60 
613c936d74e4
Log operation to output mode detailed (added, removed) files
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
55diff
changeset | 53 -> recently introduced FileWalker may perhaps help solving this (if starts walking from selected folder) for status op against WorkingDir? | 
| 9 
d6d2a630f4a6
Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
6diff
changeset | 54 | 
| 84 
08754fce5778
updated design questions
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
78diff
changeset | 55 ? Can I use fncache (names from it - perhaps, would help for Mac issues Alex mentioned) | 
| 
08754fce5778
updated design questions
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
78diff
changeset | 56 ? Does fncache list both .i and .d (iow, is it true hashed <long name>.d is different from hashed <long name>.i) | 
| 
08754fce5778
updated design questions
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
78diff
changeset | 57 | 
| 15 
865bf07f381f
Basic hgignore handling
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
14diff
changeset | 58 ??? encodings of fncache, .hgignore, dirstate | 
| 16 
254078595653
Print manifest nodeid
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
15diff
changeset | 59 ??? http://mercurial.selenic.com/wiki/Manifest says "Multiple changesets may refer to the same manifest revision". To me, each changeset | 
| 
254078595653
Print manifest nodeid
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
15diff
changeset | 60 changes repository, hence manifest should update nodeids of the files it lists, effectively creating new manifest revision. | 
| 15 
865bf07f381f
Basic hgignore handling
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
14diff
changeset | 61 | 
| 64 
19e9e220bf68
Convenient commands constitute hi-level API. org.tmatesoft namespace, GPL2 statement
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
61diff
changeset | 62 ? subrepos in log, status (-S) and manifest commands | 
| 
19e9e220bf68
Convenient commands constitute hi-level API. org.tmatesoft namespace, GPL2 statement
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
61diff
changeset | 63 | 
| 197 
3a7696fb457c
Investigate optimization options to allow fast processing of huge repositories. Fix defect in StatusCollector that lead to wrong result comparing first revision to empty repo (-1 to 0), due to same TIP constant value
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
169diff
changeset | 64 ? when p1 == -1, and p2 != -1, does HgStatusCollector.change() give correct result? | 
| 93 
d55d4eedfc57
Switch to Path instead of String in filenames returned by various status operations
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
84diff
changeset | 65 | 
| 64 
19e9e220bf68
Convenient commands constitute hi-level API. org.tmatesoft namespace, GPL2 statement
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
61diff
changeset | 66 Commands to get CommandContext where they may share various caches (e.g. StatusCollector) | 
| 93 
d55d4eedfc57
Switch to Path instead of String in filenames returned by various status operations
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
84diff
changeset | 67 Perhaps, abstract classes for all Inspectors (i.e. StatusCollector.Inspector) for users to use as base classes to protect from change? | 
| 64 
19e9e220bf68
Convenient commands constitute hi-level API. org.tmatesoft namespace, GPL2 statement
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
61diff
changeset | 68 | 
| 205 
ffc5f6d59f7e
HgLogCommand.Handler is used in few places, pull up to top-level class, HgChangesetHandler
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
202diff
changeset | 69 -cancellation and progress support | 
| 
ffc5f6d59f7e
HgLogCommand.Handler is used in few places, pull up to top-level class, HgChangesetHandler
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
202diff
changeset | 70 -timestamp check for revlog to recognize external changes | 
| 209 
9ce3b26798c4
Few branches (distinct BranchChains from distinct heads) may end up with same nodes. Building BC structure fixed to reuse chain elements
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
205diff
changeset | 71 -HgDate or any other better access to time info | 
| 
9ce3b26798c4
Few branches (distinct BranchChains from distinct heads) may end up with same nodes. Building BC structure fixed to reuse chain elements
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
205diff
changeset | 72 -(low) RepositoryComparator#calculateMissingBranches may query branches for the same head more than once | 
| 
9ce3b26798c4
Few branches (distinct BranchChains from distinct heads) may end up with same nodes. Building BC structure fixed to reuse chain elements
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
205diff
changeset | 73 (when there are a few heads that end up with common nodes). e.g. hg4j revision 7 against remote hg4j revision 206 | 
| 205 
ffc5f6d59f7e
HgLogCommand.Handler is used in few places, pull up to top-level class, HgChangesetHandler
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
202diff
changeset | 74 | 
| 9 
d6d2a630f4a6
Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
6diff
changeset | 75 >>>> Effective file read/data access | 
| 
d6d2a630f4a6
Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
6diff
changeset | 76 ReadOperation, Revlog does: repo.getFileSystem().run(this.file, new ReadOperation(), long start=0, long end = -1) | 
| 
d6d2a630f4a6
Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
6diff
changeset | 77 ReadOperation gets buffer (of whatever size, as decided by FS impl), parses it and then reports if needs more data. | 
| 
d6d2a630f4a6
Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
6diff
changeset | 78 This helps to ensure streams are closed after reading, allows caching (if the same file (or LRU) is read a few times in sequence) | 
| 
d6d2a630f4a6
Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
6diff
changeset | 79 and allows buffer management (i.e. reuse. Single buffer for all reads). | 
| 
d6d2a630f4a6
Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
6diff
changeset | 80 Scheduling multiple operations (in future, to deal with writes - single queue for FS operations - no locks?) | 
| 
d6d2a630f4a6
Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
6diff
changeset | 81 | 
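The ReadOperation idea in the notes above can be sketched roughly as follows. This is one possible reading of those lines, not hg4j code: `ReadOperation.read`, `FileSystemSketch` and the buffer size are all assumptions. The file system object owns a single reusable buffer, opens and closes the channel itself, and keeps feeding the operation until it reports that it needs no more data. For simplicity, the operation is expected to consume everything it is handed.

```java
import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.StandardOpenOption;

// Hypothetical callback: parse what is in the buffer, return true to ask for more data.
interface ReadOperation {
    boolean read(ByteBuffer data);
}

// Hypothetical file system facade; the buffer size is decided here, not by the caller.
final class FileSystemSketch {
    private final ByteBuffer buffer; // single buffer reused for all reads

    FileSystemSketch(int bufferSize) {
        buffer = ByteBuffer.allocate(bufferSize);
    }

    // end == -1 means "read up to the end of the file".
    void run(File file, ReadOperation op, long start, long end) throws IOException {
        try (FileChannel ch = FileChannel.open(file.toPath(), StandardOpenOption.READ)) {
            long limit = end == -1 ? ch.size() : Math.min(end, ch.size());
            long pos = start;
            boolean wantsMore = true;
            while (wantsMore && pos < limit) {
                buffer.clear();
                buffer.limit((int) Math.min(buffer.capacity(), limit - pos));
                int n = ch.read(buffer, pos); // positional read, channel position untouched
                if (n <= 0) {
                    break;
                }
                pos += n;
                buffer.flip();
                wantsMore = op.read(buffer);
            }
        } // the channel is closed here regardless of how the operation ended
    }
}
```

A caller would pass a lambda or a parser object, e.g. `fs.run(file, data -> parse(data), 0, -1)`, where `parse` is whatever revlog parsing the caller needs.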
| 60 
613c936d74e4
Log operation to output mode detailed (added, removed) files
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
55diff
changeset | 82 WRITE: Need to register instances that cache files (e.g. dirstate or .hgignore) to FS notifier, so that cache may get cleared if the file changes (i.e. WriteOperation touches it). | 
| 
613c936d74e4
Log operation to output mode detailed (added, removed) files
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
55diff
changeset | 83 | 
| 9 
d6d2a630f4a6
Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
6diff
changeset | 84 File access: | 
| 
d6d2a630f4a6
Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
6diff
changeset | 85 * NIO and mapped files - should be fast. Although seems to give less control on mem usage. | 
| 21 
e929cecae4e1
Refactor to move revlog content to base class
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
20diff
changeset | 86 * Regular InputStreams and chunked stream on top - allocate List<byte[]>, each (but last) chunk of fixed size (depending on initial file size) | 
| 9 
d6d2a630f4a6
Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
6diff
changeset | 87 | 
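The second file access option above (regular InputStreams with a chunked stream on top) might look like the sketch below. `ChunkedReadSketch` is a hypothetical name, and the chunk size is simply passed in, since the notes do not say how it would be derived from the file size.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of hg4j.
final class ChunkedReadSketch {

    // Reads the whole stream into chunks; every chunk but the last has exactly chunkSize bytes.
    static List<byte[]> readChunks(InputStream in, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        while (true) {
            byte[] chunk = new byte[chunkSize];
            int filled = 0;
            // A single read() may return fewer bytes than requested, so loop until full or EOF.
            while (filled < chunkSize) {
                int n = in.read(chunk, filled, chunkSize - filled);
                if (n == -1) {
                    break;
                }
                filled += n;
            }
            if (filled == chunkSize) {
                chunks.add(chunk);
                continue;
            }
            if (filled > 0) {
                chunks.add(Arrays.copyOf(chunk, filled)); // trimmed last chunk
            }
            return chunks;
        }
    }
}
```

Compared with memory-mapped files, this keeps memory use explicit: the total allocation is the file size rounded up by at most one chunk.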
| 129 
645829962785
core.Cset renamed to HgChangeset; repo.Changeset moved into HgChangelog
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
128diff
changeset | 88 | 
| 
645829962785
core.Cset renamed to HgChangeset; repo.Changeset moved into HgChangelog
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
128diff
changeset | 89 * API | 
| 
645829962785
core.Cset renamed to HgChangeset; repo.Changeset moved into HgChangelog
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
128diff
changeset | 90 + rename in .core Cset -> HgChangeset, | 
| 
645829962785
core.Cset renamed to HgChangeset; repo.Changeset moved into HgChangelog
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
128diff
changeset | 91 + rename in .repo Changeset to HgChangelog.Changeset, Changeset.Inspector -> HgChangelog.Inspector | 
| 
645829962785
core.Cset renamed to HgChangeset; repo.Changeset moved into HgChangelog
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
128diff
changeset | 92 - CommandContext | 
| 
645829962785
core.Cset renamed to HgChangeset; repo.Changeset moved into HgChangelog
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
128diff
changeset | 93 - Data access - not bytes, but ByteChannel | 
| 
645829962785
core.Cset renamed to HgChangeset; repo.Changeset moved into HgChangelog
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
128diff
changeset | 94 - HgRepository constants (TIP, BAD, WC) to HgRevisions enum | 
| 131 
aa1629f36482
Renamed .core classes to start with Hg prefix
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
129diff
changeset | 95 - RevisionMap to replace TreeMap<Integer, ?> | 
| 
aa1629f36482
Renamed .core classes to start with Hg prefix
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
129diff
changeset | 96 + .core.* rename to Hg* | 
| 
aa1629f36482
Renamed .core classes to start with Hg prefix
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
129diff
changeset | 97 + RepositoryTreeWalker to ManifestCommand to match other command classes | 
| 
aa1629f36482
Renamed .core classes to start with Hg prefix
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
129diff
changeset | 98 | 
| 
aa1629f36482
Renamed .core classes to start with Hg prefix
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
129diff
changeset | 99 * defects | 
| 136 
947bf231acbb
Strip off comments in config file
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
131diff
changeset | 100 + ConfigFile to strip comments from values (#) | 
| 129 
645829962785
core.Cset renamed to HgChangeset; repo.Changeset moved into HgChangelog
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
128diff
changeset | 101 | 
| 26 
71a9ba42cee8
Memory-mapped files for bigger files. Defect reading number of bytes greater than size of the buffer fixed
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
25diff
changeset | 102 <<<<< | 
| 197 
3a7696fb457c
Investigate optimization options to allow fast processing of huge repositories. Fix defect in StatusCollector that lead to wrong result comparing first revision to empty repo (-1 to 0), due to same TIP constant value
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
169diff
changeset | 103 Performance. | 
| 
3a7696fb457c
Investigate optimization options to allow fast processing of huge repositories. Fix defect in StatusCollector that lead to wrong result comparing first revision to empty repo (-1 to 0), due to same TIP constant value
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
169diff
changeset | 104 after pooling/caching in HgStatusCollector and HgChangeset | 
| 
3a7696fb457c
Investigate optimization options to allow fast processing of huge repositories. Fix defect in StatusCollector that lead to wrong result comparing first revision to empty repo (-1 to 0), due to same TIP constant value
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
169diff
changeset | 105 hg log --debug -r 0:5000 and same via Log/HgLogCommand: approx. 220 seconds vs 279 seconds. Mem. cons. 20 vs 80 mb. | 
| 
3a7696fb457c
Investigate optimization options to allow fast processing of huge repositories. Fix defect in StatusCollector that lead to wrong result comparing first revision to empty repo (-1 to 0), due to same TIP constant value
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
169diff
changeset | 106 after further changes in HgStatusCollector (to read ahead 5 elements, 50 max cache, fixed bug with -1) - hg4j dumps 5000 in | 
| 
3a7696fb457c
Investigate optimization options to allow fast processing of huge repositories. Fix defect in StatusCollector that lead to wrong result comparing first revision to empty repo (-1 to 0), due to same TIP constant value
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
169diff
changeset | 107 93 seconds, memory consumption about 50-56 Mb | 
| 
3a7696fb457c
Investigate optimization options to allow fast processing of huge repositories. Fix defect in StatusCollector that lead to wrong result comparing first revision to empty repo (-1 to 0), due to same TIP constant value
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
169diff
changeset | 108 | 
| 198 
33a7d76f067b
Performance optimization: reduce memory to keep revlog cached info
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
197diff
changeset | 109 IndexEntry(int offset, int baseRevision) got replaced with int[] arrays (offsets - optional) | 
| 
33a7d76f067b
Performance optimization: reduce memory to keep revlog cached info
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
197diff
changeset | 110 for 69338 revisions from cpython repo 1109408 bytes reduced to 277368 bytes with the new int[] version. | 
| 
33a7d76f067b
Performance optimization: reduce memory to keep revlog cached info
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
197diff
changeset | 111 I.e. total for changelog+manifest is 1.5 MB+ gain | 
| 
33a7d76f067b
Performance optimization: reduce memory to keep revlog cached info
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
197diff
changeset | 112 | 
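The byte counts quoted above for the IndexEntry change can be reproduced with simple arithmetic. The per-item sizes used here are assumptions chosen because they match the quoted totals exactly (16 bytes per IndexEntry object, 4 bytes per int element plus 16 bytes of array overhead); the notes do not state them. `IndexMemorySketch` is a hypothetical name.

```java
// Worked arithmetic only; the per-item sizes are assumptions that happen to match the notes.
final class IndexMemorySketch {
    static final int REVISIONS = 69338;         // cpython repo size quoted in the notes
    static final int BYTES_PER_ENTRY = 16;      // assumed size of one IndexEntry object
    static final int BYTES_PER_INT = 4;
    static final int ARRAY_OVERHEAD = 16;       // assumed array header plus alignment

    static long asObjects(int revisions) {
        return (long) revisions * BYTES_PER_ENTRY;
    }

    static long asIntArray(int revisions) {
        return (long) revisions * BYTES_PER_INT + ARRAY_OVERHEAD;
    }

    public static void main(String[] args) {
        long before = asObjects(REVISIONS);     // 1109408
        long after = asIntArray(REVISIONS);     // 277368
        System.out.println(before + " -> " + after);
        // Two revlogs of this size (changelog and manifest) save twice the difference.
        System.out.println("saved for two revlogs: " + 2 * (before - after));
    }
}
```

Under these assumptions the saving for two such revlogs is 1 664 080 bytes, in line with the "1.5 MB+" figure.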
| 200 
114c9fe7b643
Performance optimization: reduce memory ParentWalker hogs
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
198diff
changeset | 113 ParentWalker got arrays (Nodeid[] and int[]) instead of HashMap/LinkedHashSet. This change saves, per revision: | 
| 
114c9fe7b643
Performance optimization: reduce memory ParentWalker hogs
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
198diff
changeset | 114 was: LinkedHashSet$Entry:32 + HashMap$Entry:24 + HashMap.entries[]:4 (in fact, up to 8, given entries size is power of 2, and 69000+ | 
| 
114c9fe7b643
Performance optimization: reduce memory ParentWalker hogs
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
198diff
changeset | 115 elements in cpython test repo resulted in entries[131072]. | 
| 
114c9fe7b643
Performance optimization: reduce memory ParentWalker hogs
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
198diff
changeset | 116 total: (2 HashMaps) 32+(24+4)*2 = 88 bytes | 
| 
114c9fe7b643
Performance optimization: reduce memory ParentWalker hogs
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
198diff
changeset | 117 now: Nodeid[]:4 , int[]:4 bytes per entry. arrays of exact revlog size | 
| 
114c9fe7b643
Performance optimization: reduce memory ParentWalker hogs
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
198diff
changeset | 118 total: (4 Nodeid[], 1 int[]) 4*4 + 4 = 20 bytes | 
| 
114c9fe7b643
Performance optimization: reduce memory ParentWalker hogs
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
198diff
changeset | 119 for cpython test repo with 69338 revisions, 1 387 224 instead of 4 931 512 bytes. Mem usage (TaskManager) ~50 Mb when 10000 revs read | 
| 
114c9fe7b643
Performance optimization: reduce memory ParentWalker hogs
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
198diff
changeset | 120 | 
| 197 
3a7696fb457c
Investigate optimization options to allow fast processing of huge repositories. Fix defect in StatusCollector that lead to wrong result comparing first revision to empty repo (-1 to 0), due to same TIP constant value
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
169diff
changeset | 121 <<<<< | 
| 26 
71a9ba42cee8
Memory-mapped files for bigger files. Defect reading number of bytes greater than size of the buffer fixed
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
25diff
changeset | 122 | 
| 
71a9ba42cee8
Memory-mapped files for bigger files. Defect reading number of bytes greater than size of the buffer fixed
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
25diff
changeset | 123 Tests: | 
| 61 
fac8e7fcc8b0
Simple test framework - capable of parsing Hg cmdline output to compare with Java result
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
60diff
changeset | 124 DataAccess - readBytes(length > memBufferSize, length*2 > memBufferSize) - to check the impl is capable of reading huge chunks of data, regardless of own buffer size | 
| 
fac8e7fcc8b0
Simple test framework - capable of parsing Hg cmdline output to compare with Java result
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
60diff
changeset | 125 | 
| 129 
645829962785
core.Cset renamed to HgChangeset; repo.Changeset moved into HgChangelog
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
128diff
changeset | 126 ExecHelper('cmd', OutputParser()).run(). StatusOutputParser, LogOutputParser extends OutputParser. construct java result similar to that of cmd, compare results | 
| 
645829962785
core.Cset renamed to HgChangeset; repo.Changeset moved into HgChangelog
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
128diff
changeset | 127 | 
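The ExecHelper/OutputParser arrangement described above could be sketched like this. Only the class names come from the notes; the method signatures, the argument order of the `ExecHelper` constructor and the `filesByStatus` field are assumptions. The parser relies on `hg status` printing one file per line as a status character, a space, and the path.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Receives the complete output of a command line run.
interface OutputParser {
    void parse(CharSequence output);
}

// Collects `hg status` output into lists of paths keyed by status character (M, A, R, ?, ...).
final class StatusOutputParser implements OutputParser {
    final Map<Character, List<String>> filesByStatus = new TreeMap<>();

    @Override
    public void parse(CharSequence output) {
        for (String line : output.toString().split("\\r?\\n")) {
            if (line.length() < 3 || line.charAt(1) != ' ') {
                continue; // not a "<status> <path>" line
            }
            filesByStatus.computeIfAbsent(line.charAt(0), k -> new ArrayList<>())
                         .add(line.substring(2));
        }
    }
}

// Runs an external command and hands its output to the parser.
final class ExecHelper {
    private final OutputParser parser;
    private final String[] command;

    ExecHelper(OutputParser parser, String... command) {
        this.parser = parser;
        this.command = command;
    }

    void run() throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).redirectErrorStream(true).start();
        byte[] out;
        try (InputStream in = p.getInputStream()) {
            out = in.readAllBytes(); // requires Java 9 or later
        }
        p.waitFor();
        parser.parse(new String(out, StandardCharsets.UTF_8));
    }
}
```

A test would then build the same map from the Java status result and compare it with `filesByStatus` filled by `new ExecHelper(parser, "hg", "status").run()`.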
| 202 
706bcc7cfee4
Basic test for HgIncomingCommand. Fix RepositoryComparator for cases when whole repository is unknown. Respect freshly initialized (empty) repositories in general.
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
200diff
changeset | 128 Need better MethodRule than ErrorCollector for tests run as java app (to print not only MultipleFailureException, but distinct errors) | 
| 
706bcc7cfee4
Basic test for HgIncomingCommand. Fix RepositoryComparator for cases when whole repository is unknown. Respect freshly initialized (empty) repositories in general.
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
200diff
changeset | 129 Also consider using ExternalResource and TemporaryFolder rules. | 
| 367 
2fadf8695f8a
Use 'revision index' instead of the vague 'local revision number' concept in the API
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
209diff
changeset | 130 | 
| 
2fadf8695f8a
Use 'revision index' instead of the vague 'local revision number' concept in the API
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
209diff
changeset | 131 | 
| 
2fadf8695f8a
Use 'revision index' instead of the vague 'local revision number' concept in the API
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
209diff
changeset | 132 ================= | 
| 
2fadf8695f8a
Use 'revision index' instead of the vague 'local revision number' concept in the API
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
209diff
changeset | 133 Naming: | 
| 
2fadf8695f8a
Use 'revision index' instead of the vague 'local revision number' concept in the API
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
209diff
changeset | 134 nodeid: revision | 
| 
2fadf8695f8a
Use 'revision index' instead of the vague 'local revision number' concept in the API
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
209diff
changeset | 135 int: revisionIndex (alternatives: revisionNumber, localRevisionNumber) | 
| 
2fadf8695f8a
Use 'revision index' instead of the vague 'local revision number' concept in the API
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
209diff
changeset | 136 BUT, if class name bears Revision, may use 'index' and 'nodeid' | 
| 
2fadf8695f8a
Use 'revision index' instead of the vague 'local revision number' concept in the API
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
209diff
changeset | 137 NOT nodeid because although fileNodeid and changesetNodeid are ok (less to my liking than fileRevision, however), it's not clear how | 
| 
2fadf8695f8a
Use 'revision index' instead of the vague 'local revision number' concept in the API
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
209diff
changeset | 138 to name integer counterpart, just 'index' is unclear, need to denote nodeid and index are related. 'nodeidIndex' would be odd. | 
| 
2fadf8695f8a
Use 'revision index' instead of the vague 'local revision number' concept in the API
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
209diff
changeset | 139 Unfortunately, Revision would be a nice name for a class <int, Nodeid>. As long as I don't want to keep methods to access int/nodeid separately | 
| 
2fadf8695f8a
Use 'revision index' instead of the vague 'local revision number' concept in the API
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
209diff
changeset | 140 and not to stick to Revision struct only (to avoid massive instances of Revision<int,Nodeid> when only one is sufficient), I'll need to name | 
| 
2fadf8695f8a
Use 'revision index' instead of the vague 'local revision number' concept in the API
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
209diff
changeset | 141 these separate methods anyway. Present opinion is that I don't need the object right now (will have to live with RevisionObject or RevisionDescriptor | 
| 427 
31a89587eb04
FIXMEs: consistent names, throws for commands and their handlers. Use of checked exceptions in hi-level api
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
396diff
changeset | 142 if I change my mind later) | 
| 
31a89587eb04
FIXMEs: consistent names, throws for commands and their handlers. Use of checked exceptions in hi-level api
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
396diff
changeset | 143 | 
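The naming rule above (nodeid is 'revision', int is 'revisionIndex') can be illustrated with a minimal sketch. The Revlog class, its two methods, and the placeholder nodeid strings are hypothetical and exist only for illustration; they are not the library's actual API.

```java
import java.util.Arrays;
import java.util.List;

public class RevisionNamingSketch {

    /** Hypothetical minimal log: list position is the revisionIndex, the stored value is the revision (nodeid). */
    static class Revlog {
        private final List<String> nodeids;

        Revlog(String... nodeids) {
            this.nodeids = Arrays.asList(nodeids);
        }

        /** int to nodeid: the parameter name says which of the two forms is expected. */
        String getRevision(int revisionIndex) {
            return nodeids.get(revisionIndex);
        }

        /** nodeid to int; returns -1 when the nodeid is unknown to this log. */
        int getRevisionIndex(String revision) {
            return nodeids.indexOf(revision);
        }
    }

    public static void main(String[] args) {
        // placeholder nodeids, not real changesets
        Revlog log = new Revlog("nodeid-0", "nodeid-1", "nodeid-2");
        System.out.println(log.getRevision(1));
        System.out.println(log.getRevisionIndex("nodeid-2"));
    }
}
```

Keeping two separately named accessors avoids allocating a pair object when a caller needs only one of the two forms, which is the trade-off the note settles on.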
| 
31a89587eb04
FIXMEs: consistent names, throws for commands and their handlers. Use of checked exceptions in hi-level api
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
396diff
changeset | 144 Handlers (HgStatusHandler, HgManifestHandler, HgChangesetHandler, HgChangesetTreeHandler) | 
| 
31a89587eb04
FIXMEs: consistent names, throws for commands and their handlers. Use of checked exceptions in hi-level api
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
396diff
changeset | 145 methods DO NOT throw CancelledException. Cancellation is separate from processing logic. Handlers can implement CancelSupport to become a source of cancellation, if necessary | 
| 
31a89587eb04
FIXMEs: consistent names, throws for commands and their handlers. Use of checked exceptions in hi-level api
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
396diff
changeset | 146 methods DO throw HgCallbackTargetException to propagate own errors/exceptions | 
| 
31a89587eb04
FIXMEs: consistent names, throws for commands and their handlers. Use of checked exceptions in hi-level api
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
396diff
changeset | 147 methods are supposed to let HgRuntimeExceptions pass through silently (although callback implementers may decide to wrap them into HgCallbackTargetException) | 
| 
31a89587eb04
FIXMEs: consistent names, throws for commands and their handlers. Use of checked exceptions in hi-level api
 Artem Tikhomirov <tikhomirov.artem@gmail.com> parents: 
396diff
changeset | 148 descriptive names for the methods, whenever possible (not bare #next) | 
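The handler rules above can be sketched roughly as follows. All types here are simplified local stand-ins named after the ones in the notes (CallbackTargetException for HgCallbackTargetException, and so on). Their signatures, the dispatch() driver, and the FirstN handler are assumptions made for illustration, not the library's real declarations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HandlerContractSketch {

    /** Stand-in for HgCallbackTargetException: carries a failure raised by the handler's own code. */
    static class CallbackTargetException extends Exception {
        CallbackTargetException(Throwable cause) { super(cause); }
    }

    /** Stand-in for CancelledException: raised by the driver, never declared by a handler method. */
    static class CancelledException extends Exception {}

    /** Stand-in for CancelSupport (assumed, simplified signature): an optional source of cancellation. */
    interface CancelSupport {
        boolean isCancelled();
    }

    /** Stand-in handler with a descriptive method name instead of a bare next(). */
    interface ChangesetHandler {
        void changeset(String nodeid) throws CallbackTargetException;
    }

    /** Hypothetical driver: checks cancellation between callbacks; runtime exceptions are not caught and pass through. */
    static int dispatch(List<String> nodeids, ChangesetHandler handler)
            throws CallbackTargetException, CancelledException {
        CancelSupport cancel = handler instanceof CancelSupport ? (CancelSupport) handler : null;
        int delivered = 0;
        for (String nodeid : nodeids) {
            if (cancel != null && cancel.isCancelled()) {
                throw new CancelledException();
            }
            handler.changeset(nodeid);
            delivered++;
        }
        return delivered;
    }

    /** Example handler: records changesets and asks for cancellation once 'limit' of them were seen. */
    static class FirstN implements ChangesetHandler, CancelSupport {
        final List<String> seen = new ArrayList<>();
        private final int limit;

        FirstN(int limit) { this.limit = limit; }

        @Override
        public void changeset(String nodeid) throws CallbackTargetException {
            if (nodeid.isEmpty()) {
                // the handler's own error is propagated wrapped in the callback exception
                throw new CallbackTargetException(new IllegalArgumentException("empty nodeid"));
            }
            seen.add(nodeid);
        }

        @Override
        public boolean isCancelled() { return seen.size() >= limit; }
    }

    public static void main(String[] args) throws Exception {
        FirstN handler = new FirstN(2);
        try {
            // placeholder nodeids, not real changesets
            dispatch(Arrays.asList("nodeid-0", "nodeid-1", "nodeid-2"), handler);
        } catch (CancelledException ex) {
            System.out.println("cancelled after " + handler.seen.size() + " changesets");
        }
    }
}
```

In this sketch the processing method stays free of cancellation concerns: the same FirstN handler works whether or not the driver honours CancelSupport, and only the handler's own failures travel through the checked callback exception.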
