Full Version: Distributed version control systems (DVCS)
Pages: 1 2
During the recent IRC meeting we discussed adopting a DVCS: http://vdrift.net/Forum/viewtopic.php?p=11684#11684

The consensus seemed to be that we should look into Git.

The next logical decision to make is whether or not we should move some portion of our project hosting to GitHub.

I think this is a good idea, and we should move over everything but the wiki, data repository, main site/forums, and cars site. That means the following would move to GitHub: source code version control (currently svn.vdrift.net), issue tracker (currently Google Code), and project metrics (currently Ohloh). We should make the move while we are working on the release, or directly after the next release.

Please discuss. If GitHub sounds good to everyone I can go ahead and set up a VDrift project there.

Here's an interesting link: http://www.infoq.com/articles/dvcs-guide
I've only been using GitHub as a sort of experiment, because it provides a very straightforward user-oriented repository layout.

You may want to investigate the git hosting available on sourceforge.net, seeing as the project already has data svn there.
https://sourceforge.net/apps/trac/sourceforge/wiki/Git
Google Code is another alternative; they support Mercurial and have a great issue tracker.

The GitHub code review tool looks awesome, though.
It makes sense to think about workflow with DVCS. Here's a set of blog entries that cover a reasonable-looking workflow using GitHub.

Part 1
http://www.silverwareconsulting.com/inde...troduction

Part 2
http://www.silverwareconsulting.com/inde...ng-Started

Part 3
http://www.silverwareconsulting.com/inde...oping-Code

Part 4
http://www.silverwareconsulting.com/inde...tributions
Here's some stuff about the GitHub issue tracker:
https://github.com/blog/411-github-issue-tracker


fudje Wrote:You may want to investigate the git hosting available on sourceforge.net, seeing as the project already has data svn there.
Basically, the question is whether we want it to be more project-oriented or user-oriented. SourceForge is more project-oriented: a project hosts a single repository, users can clone/branch, and admins authorize the users who are allowed to write (i.e. pull from a branch to master). GitHub is more user-oriented: any user can make a fork of a public repository, and then submit a pull request to the original repository if/when they are ready for their changes to go upstream.

SourceForge Git docs:
https://sourceforge.net/apps/trac/sourceforge/wiki/Git

GitHub Help:
http://help.github.com/
I'd say that workflow is over-complicated, at least for git.

Firstly, it's never necessary to keep a local branch of a mainline development tree that you don't touch, because git keeps track of remote branches at the same level as it does local branches. So instead of having

remote master (read-only) > remote master (read-write) <> local master <> local topic1..n

you can have something like

remote master (read-only) > remote master (read-write) <> local topic1..n

If you want to test the remote master without your changes, you can commit your current changes to your local branch and check out the remote branch or master directly (into what git refers to as a "detached HEAD"). Similarly, you can compare remote and local branches directly using e.g. 'git diff'. This won't cost you any extra bandwidth, and when you want to get a new copy of the remote branches, you can do so without affecting any local branch whatsoever.
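A rough sketch of that layout, assuming throwaway repositories under a temporary directory. The names upstream.git, seed, work, topic1, engine.txt and feature.txt are all made up for illustration; a bare local repository stands in for the shared remote.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
tmp=$(mktemp -d)
cd "$tmp"

# A bare repository stands in for the shared remote (hypothetical name).
git init -q --bare upstream.git
git -C upstream.git symbolic-ref HEAD refs/heads/master

# Seed the remote with one commit on master.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo "v1" > engine.txt
git add engine.txt
git commit -q -m "initial"
git push -q "$tmp/upstream.git" master
cd ..

# Developer clone: work on a topic branch and drop the untouched local master.
git clone -q "$tmp/upstream.git" work
cd work
git checkout -q -b topic1 origin/master
git branch -q -D master
echo "my change" > feature.txt
git add feature.txt
git commit -q -m "topic work"
topic_before=$(git rev-parse topic1)

# Meanwhile someone else pushes to the remote master.
cd ../seed
echo "v2" > engine.txt
git commit -q -a -m "upstream change"
git push -q "$tmp/upstream.git" master
cd ../work

# Update the remote-tracking branches; no local branch moves.
git fetch -q origin
topic_after=$(git rev-parse topic1)

# Compare the remote and local branches directly.
changed=$(git diff --name-only origin/master topic1)
echo "$changed"

# Test the remote master without local changes (detached HEAD).
git checkout -q --detach origin/master
detached_at=$(git rev-parse HEAD)
```

Getting back to work afterwards is just 'git checkout topic1'; nothing on the topic branch was touched by the fetch or by the detached checkout.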

Another thing to note is that superfluous use of rebase is not generally recommended. A rebase rewrites the history of your local branch so that all of your local changes appear to happen after any remote changes that occurred. If there are conflicting changes, i.e. you and another developer working on the same region of a file, it quickly becomes difficult to recognise where the conflicts were resolved. Rebasing is primarily useful for branches where you pull the remote branch, perform and commit all your changes, get any updates from the remote one time, fix any conflicts, submit a pull request, and never touch that branch again. If you are performing a lot of changes to one branch over time, merging the remote branch in at frequent intervals and fixing merge conflicts along the way gives a clear indication of what had to be changed at what point, reducing the amount of work that has to be done when your branch is merged back into the main development tree.
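To make the difference concrete, here is a minimal sketch in a scratch repository. The branch names topic-rebase and topic-merge and the file names are hypothetical; the same topic commit is synced with master once by rebase and once by merge.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
echo base > base.txt
git add base.txt
git commit -q -m "base"

# One topic commit, then a second branch pointing at the same commit
# so both strategies can be compared side by side.
git checkout -q -b topic-rebase
echo topic > topic.txt
git add topic.txt
git commit -q -m "topic work"
orig=$(git rev-parse HEAD)
git branch topic-merge

# The mainline moves on.
git checkout -q master
echo main > main.txt
git add main.txt
git commit -q -m "mainline work"

# Strategy 1: rebase replays the topic commit on top of master,
# which gives it a new commit id.
git checkout -q topic-rebase
git rebase -q master
rebased=$(git rev-parse HEAD)

# Strategy 2: merge keeps the original commit and records a merge commit.
git checkout -q topic-merge
git merge -q -m "Merge master into topic-merge" master

echo "original: $orig"
echo "rebased:  $rebased"
git log --oneline --graph topic-merge
```

After the rebase the original commit id is no longer part of the branch, whereas the merged branch still contains it along with a merge commit marking where the sync happened.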

On a bit of a tangent here, I'd also recommend against creating a fork using GitHub's tools until you're ready to push some changes to it. Git can easily handle multiple remotes, allowing a mix of read-only and writable ones, and this also makes it easier to avoid having redundant copies of the master branch on your git remote. This is an almost entirely different workflow, though.
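A small sketch of the multiple-remotes idea, with local bare repositories standing in for the project repository and the fork. The names upstream.git, fork.git, work and topic1 are assumptions for illustration, not anything GitHub prescribes.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
tmp=$(mktemp -d)
cd "$tmp"

# upstream.git plays the project repository (hypothetical name).
git init -q --bare upstream.git
git -C upstream.git symbolic-ref HEAD refs/heads/master
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo v1 > engine.txt
git add engine.txt
git commit -q -m "initial"
git push -q "$tmp/upstream.git" master
cd ..

# Clone the project; "origin" is only ever fetched from here.
git clone -q "$tmp/upstream.git" work
cd work
git checkout -q -b topic1 origin/master
echo change > feature.txt
git add feature.txt
git commit -q -m "topic work"

# Only now create the fork and add it as a second, writable remote.
git init -q --bare "$tmp/fork.git"
git remote add fork "$tmp/fork.git"
git remote -v

# Publish just the topic branch; no copy of master lands in the fork.
git push -q fork topic1
```

With a real GitHub fork the new repository would start out with the project's branches already in it, so this only approximates the "no redundant master" part.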
Eek reply spam. :oops:

As for user-oriented GitHub vs. project-oriented SourceForge, I guess having the project on GitHub in the first place does make pull requests a lot easier for most people :)
Thanks for the info fudje, your experience is helpful. Reply all you like.

Here's something I'm interested in hearing about. Let's pretend for a moment that VDrift was already on GitHub (with a few branches: master, development, and last-release) when you started your fork there [1].

What kind of workflow would you have used?

What kind of workflow could NaN have used to incorporate your changes into the development branch?

[1] https://github.com/fjwhittle/vdrift
fudje Wrote:As for user-oriented GitHub vs. project-oriented SourceForge, I guess having the project on GitHub in the first place does make pull requests a lot easier for most people :)

I think the user-oriented "encourage forking" paradigm aligns very well with the idea of this project being a sandbox for free experimentation by developers, as mentioned by Joe here: http://vdrift.net/Forum/viewtopic.php?p=11699#11699

Joe, any thoughts on that?
Quote:What kind of workflow would you have used?
What kind of workflow could NaN have used to incorporate your changes into the development branch?
He would send me a pull request. More info: http://help.github.com/pull-requests/
I had a much, much longer reply, but my browser crashed and ate it, wasting about an hour of careful proof checking to make sure I had all the details right. I'll do a more detailed blog post later ;)

thelusiv Wrote:Here's something I'm interested in hearing about. Let's pretend for a moment that VDrift was already on GitHub (with a few branches: master, development, and last-release) when you started your fork there [1].

What kind of workflow would you have used?

I'd already have a copy of the working tree, so I'd make a branch locally.
I do my changes in that branch, committing locally very frequently and getting new changes with 'git fetch origin; git rebase' less frequently, but probably at least daily. Yes, I know I said don't use rebase too much; keep reading.
When I want to publish my changes, having rewritten my change history as often as is convenient, I fork the project on GitHub and set up the remote. Push it, make the beats go farther.
Submit a pull request. This is basically a memorandum to the project saying "get my changes please." Using GitHub, this creates a "Pull Request" "Issue" that anyone can comment on. Otherwise, it's just like any email.
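The steps above could look roughly like this. It is only a sketch: local bare repositories stand in for the project and the fork on GitHub, and every name (upstream.git, fork.git, work, topic1, the file names) is hypothetical. The pull request itself happens on the website and is not shown.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
tmp=$(mktemp -d)
cd "$tmp"

# Project repository with a development branch (stand-in for GitHub).
git init -q --bare upstream.git
git -C upstream.git symbolic-ref HEAD refs/heads/development
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/development
echo v1 > engine.txt
git add engine.txt
git commit -q -m "initial"
git push -q "$tmp/upstream.git" development
cd ..

# Local branch with frequent small commits.
git clone -q "$tmp/upstream.git" work
cd work
git checkout -q -b topic1 origin/development
echo one > feature.txt
git add feature.txt
git commit -q -m "topic: step 1"
echo two >> feature.txt
git commit -q -a -m "topic: step 2"

# Upstream moves on in the meantime.
cd ../seed
echo v2 > engine.txt
git commit -q -a -m "upstream change"
git push -q "$tmp/upstream.git" development
cd ../work

# Occasional sync while the branch is still unpublished.
git fetch -q origin
git rebase -q origin/development

# Ready to publish: create the fork, add it as a remote, push the branch.
git init -q --bare "$tmp/fork.git"
git remote add fork "$tmp/fork.git"
git push -q fork topic1
git log --oneline --graph topic1
```

Because the branch had not been pushed anywhere before the rebase, rewriting its history here affects nobody else.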

thelusiv Wrote:What kind of workflow could NaN have used to incorporate your changes into the development branch?

Add the remote and pull the changes into a new branch (on his local repo). Ideally he'd perform any necessary changes (e.g. if I interpreted the coding style wrong in places) in that branch, merge it into development, then push to the main repository on GitHub. Alternatively, if there are some commits he's happy with and others that can't be used for some reason, git has a cherry-pick command that allows him to import particular commits into the development branch and make further changes where necessary.
As a foolish last resort, it's also possible to directly compare the two branches, import all the changes in one commit, and bugger up everyone's change history. He wouldn't want to do that, though ;)
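A sketch of both options from the maintainer's side, assuming scratch repositories. The remote name contributor and the branches review, development-merged and development-picked are invented here; in practice the merge or cherry-pick would go straight onto development, and separate branches are only used so both options can be shown in one run.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
tmp=$(mktemp -d)
cd "$tmp"

# Maintainer's repository with a development branch.
git init -q main
cd main
git symbolic-ref HEAD refs/heads/development
echo v1 > engine.txt
git add engine.txt
git commit -q -m "initial"
cd ..

# Contributor's copy with two commits on topic1.
git clone -q "$tmp/main" contrib
cd contrib
git checkout -q -b topic1
echo fix > fix.txt
git add fix.txt
git commit -q -m "useful fix"
echo wip > wip.txt
git add wip.txt
git commit -q -m "experimental change"
cd ../main

# Add the contributor's repository as a remote and fetch it.
git remote add contributor "$tmp/contrib"
git fetch -q contributor

# Option 1: review on a new local branch, adjust, then merge everything.
git checkout -q -b review contributor/topic1
echo "style cleanup" >> fix.txt
git commit -q -a -m "adjust coding style"
git checkout -q -b development-merged development
git merge -q --no-ff -m "Merge contributor topic1" review

# Option 2: import only the commit that is wanted.
git checkout -q -b development-picked development
git cherry-pick contributor/topic1~1 >/dev/null

git log --oneline --graph development-merged
git log --oneline development-picked
```

The merged branch keeps the contributor's commits with their original ids, while the cherry-picked commit is a new copy with a different id.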

thelusiv probably also should have Wrote:You made a boo-boo, or have further changes or something, what do you do now?
This is where "merge is better than rebase" comes into play. Let's imagine that the main development branch has all my changes, and some corrections. If I now do a rebase off that branch, it's going to create lots of confusion and possibly introduce some merge conflicts that make no sense. The biggest problem, though, would come when it's time to push again: git will complain mightily about the remote branch not being an ancestor, and while I can force it through, anyone pulling my branch into the main repo again is going to face the same problems. So the very first thing I should do is merge the development branch again locally. I can now make all the changes I want, but it would be very painful to ever try rebasing on this branch again, because every time I rebase against the remote development branch, the first thing I'll have to work around is the possible conflicts of that merge. So when I want to update, I merge again. This has the effect that every time I sync my branch with the main development one, a new merge commit is visible in my branch and in anything that pulls from it. But don't worry, git is designed to work with these; it won't duplicate commits.
In any case, probably the best strategy is to merge once at the beginning of a block of work, and once again before pushing it to my GitHub repo, thus minimizing the number of conflict resolution (large snapshot) commits.
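Here is a minimal sketch of that merge-then-merge-again cycle in a single scratch repository, where development plays the main branch and topic1 my branch. All names and file contents are made up for the example.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/development
echo v1 > engine.txt
git add engine.txt
git commit -q -m "initial"

# First round of topic work, merged into development and then corrected there.
git checkout -q -b topic1
echo one > feature.txt
git add feature.txt
git commit -q -m "topic: first round"
git checkout -q development
git merge -q --no-ff -m "Merge topic1" topic1
echo "corrected" >> feature.txt
git commit -q -a -m "maintainer correction"

# Before any further work: merge development back in (a fast-forward here).
git checkout -q topic1
git merge -q --ff-only development

# More work happens on both sides.
echo two > more.txt
git add more.txt
git commit -q -m "topic: second round"
git checkout -q development
echo v2 > engine.txt
git commit -q -a -m "other development work"

# Sync again by merging; this records a merge commit on topic1.
git checkout -q topic1
git merge -q -m "Merge development into topic1" development

# The first-round commit still appears exactly once in the history.
copies=$(git log --format=%s topic1 | grep -c "^topic: first round$")
echo "copies of first-round commit: $copies"
git log --oneline --graph topic1
```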
About rebase:
Quote:And sure enough, the git-rebase manpage says, “When you rebase a branch, you are changing its history in a way that will cause problems for anyone who already has a copy of the branch in their repository and tries to pull updates from you.”

I maintain, therefore, that git-rebase is evil and should be avoided. It only works for a situation where someone maintains a private branch of a project, never shared in any way except to submit patches to an upstream. Forget it if you have a team maintaining that branch, or want to post that branch online for others to help with (as I do with my Debian darcs package). Even if you keep it private now, do you really want to adopt a work process that forces you to keep it private forever, or else completely change how you work?
So if git-rebase only has limited usefulness, what is a better way to cleanly merge commits from upstream into a branch?
cherry-pick them?
I believe Torvalds echoed those sentiments about rebasing but louder and with more swear words.
To simplify everything said about it so far: it is almost always better to merge than to rebase. To the uninitiated it looks ugly, but it is a much cleaner method, in that it's obvious where adjustments had to be made to keep the code in sync.

NaN Wrote:cherry-pick them?

Possible, tedious, silly? (You did mean as a sync method, not a request for explanation, yes?)