@@ -1685,3 +1685,49 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor
16851685	clear_survey_context(&ctx);
16861686	return 0;
16871687}
1688+ 
1689+ /* 
1690+  * NEEDSWORK: The following is a bit of a laundry list of things 
1691+  * that I'd like to add. 
1692+  * 
1693+  * [] Dump stats on all of the packfiles. The number and size of each. 
1694+  * Whether each is in the .git directory or in an alternate.  The state 
1695+  * of the IDX or MIDX files, and so on.  Delta chain stats.  All of this 
1696+  * data is relative to the "lived-in" state of the repository.  Stuff 
1697+  * that may change after a GC or repack. 
1698+  * 
1699+  * [] Dump stats on each remote.  When we fetch from a remote the size 
1700+  * of the response is related to the set of haves on the server.  You 
1701+  * can see this in `GIT_TRACE_CURL=1 git fetch`. We get a `ls-refs` 
1702+  * payload that lists all of the branches and tags on the server, so 
1703+  * at a minimum the RefName and SHA for each. But for annotated tags 
1704+  * we also get the peeled SHA.  The size of this overhead on every 
1705+  * fetch is proportional to the size of the `git ls-remote` response 
1706+  * (roughly, although the latter repeats the RefName of the peeled 
1707+  * tag).  If, for example, you have 500K refs on a remote, you're 
1708+  * going to have a long "haves" message, so every fetch will be slow 
1709+  * just because of that overhead (not counting new objects to be 
1710+  * downloaded). 
1711+  * 
1712+  * Note that the local set of tags in "refs/tags/" is a union over all 
1713+  * remotes.  However, since most people only have one remote, we can 
1714+  * probably estimate the overhead value directly from the size of the 
1715+  * set of "refs/tags/" that we visited while building the `ref_info` 
1716+  * and `ref_array` and not need to ask the remote. 
1717+  * 
1718+  * [] Dump info on the complexity of the DAG.  Criss-cross merges. 
1719+  * The number of edges that must be touched to compute merge bases. 
1720+  * Edge length. The number of parallel lanes in the history that must 
1721+  * be navigated to get to the merge base.  What affects the cost of 
1722+  * the Ahead/Behind computation?  How often do criss-crosses occur and 
1723+  * do they cause various operations to slow down? 
1724+  * 
1725+  * [] If there are primary branches (like "main" or "master"), are they 
1726+  * always on the left side of merges?  Does the graph have a clean 
1727+  * left edge?  Or are there normal and "backwards" merges?  Do these 
1728+  * cause problems at scale? 
1729+  * 
1730+  * [] If we have a hierarchy of FI/RI branches like "L1", "L2", ..., 
1731+  * can we learn anything about the shape of the repo around these FI 
1732+  * and RI integrations? 
1733+  */ 
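
The per-fetch `ls-refs` overhead described in the second item can be approximated from the ref names alone. The following is a minimal sketch, not part of the patch: `struct ref_sample` and `estimate_ls_refs_bytes()` are hypothetical names, and the sizes assume protocol v2 pkt-lines of the form `<oid> SP <refname> [SP peeled:<oid>] LF` with SHA-1 object IDs. It ignores symref attributes and capability lines, so treat the result as a rough lower bound.

```c
#include <stddef.h>
#include <string.h>

#define PKT_LEN_PREFIX 4 /* 4 hex digits of pkt-line length */
#define HEX_OID_LEN 40   /* assumes SHA-1; a SHA-256 repo would use 64 */

/* Hypothetical input record: one entry per ref visited. */
struct ref_sample {
	const char *name;     /* full ref name, e.g. "refs/tags/v1.0" */
	int is_annotated_tag; /* non-zero if a peeled OID is also sent */
};

/* Estimate the byte size of an ls-refs response for the given refs. */
static size_t estimate_ls_refs_bytes(const struct ref_sample *refs, size_t nr)
{
	size_t total = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		/* "<len><oid> <refname>\n" */
		total += PKT_LEN_PREFIX + HEX_OID_LEN + 1 +
			 strlen(refs[i].name) + 1;
		/* " peeled:<oid>" for annotated tags */
		if (refs[i].is_annotated_tag)
			total += 1 + strlen("peeled:") + HEX_OID_LEN;
	}

	/* trailing flush-pkt "0000" */
	return total + PKT_LEN_PREFIX;
}
```

Under these assumptions, a survey pass that already walks `refs/tags/` could accumulate this total as it goes, rather than asking the remote.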