Merged
1 change: 1 addition & 0 deletions .gitignore
@@ -6,6 +6,7 @@ helix-core-api/
.DS_Store
.vscode/
*.zip
compile_commands.json

# Depot conversion results
clones/
1 change: 1 addition & 0 deletions CMakeLists.txt
@@ -5,6 +5,7 @@ option(BUILD_TESTS "Build tests" OFF)

set(CXX_STANDARD_REQUIRED true)
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

project(
p4-fusion
53 changes: 43 additions & 10 deletions README.md
@@ -28,7 +28,12 @@ These execution times are expected to scale as expected with larger depots (mill
## Usage

```shell
[ PRINT @ Main:56 ] Usage:
[ PRINT @ Main:59 ] Usage:
--branch [Optional, Default is empty]
A branch to migrate under the depot path. May be specified more than once. If at least one is given and the noMerge option is false, then the Git repository will include merges between branches
in the history. You may use the formatting 'depot/path:git-alias', separating the Perforce branch sub-path from the git alias name by a ':'; if the depot path contains a ':', then you must provide
the git branch alias.

--client [Required]
Name/path of the client workspace specification.

@@ -44,12 +49,6 @@ These execution times are expected to scale as expected with larger depots (mill
--lookAhead [Required]
How many CLs in the future, at most, shall we keep downloaded by the time it is to commit them?

--branch [Optional]
A branch to migrate under the depot path. May be specified more than once. If at least one is given and the noMerge option is false, then the Git repository will include merges between branches in the history. You may use the formatting 'depot/path:git-alias', separating the Perforce branch sub-path from the git alias name by a ':'; if the depot path contains a ':', then you must provide the git branch alias.

--noMerge [Optional, Default is false]
When false and at least one branch is given, then . If this is true, then the Git history will not contain any merges, except for an artificial empty commit added at the root, which acts as a common source to make later merges easier.

--maxChanges [Optional, Default is -1]
Specify the max number of changelists which should be processed in a single run. -1 signifies unlimited range.

@@ -59,8 +58,11 @@ These execution times are expected to scale as expected with larger depots (mill
--noColor [Optional, Default is false]
Disable colored output.

--noMerge [Optional, Default is false]
Disable performing a Git merge when a Perforce branch integrates (or copies, etc) into another branch.

--path [Required]
P4 depot path to convert to a Git repo
P4 depot path to convert to a Git repo. If used with '--branch', this is the base path for the branches.

--port [Required]
Specify which P4PORT to use.
@@ -77,6 +79,9 @@ These execution times are expected to scale as expected with larger depots (mill
--src [Required]
Relative path where the git repository should be created. This path should be empty before running p4-fusion for the first time in a directory.

--streamMappings [Optional, Default is false]
Use Mappings defined by Perforce Stream Spec for a given stream

--user [Required]
Specify which P4USER to use. Please ensure that the user is logged in.
```
@@ -95,6 +100,33 @@ Because Perforce integration isn't a 1-to-1 mapping onto Git merge, there can be

If the Perforce tree contains sub-branches, such as `//base/tree/sub` being a sub-branch of `//base/tree`, then you can use the arguments `--path //base/... --branch tree/sub:tree-sub --branch tree`. The ordering is important here: provide the deeper paths first so that they take priority over the others. Because Git treats '/' characters in branch names as implicit directories, you must provide the Git branch alias. Otherwise Git reports an error that the branch "tree" can't be created because it is already a directory, or that "tree/sub" can't be created because "tree" isn't a directory.
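
The effect of the ordering can be illustrated with a minimal first-match sketch. It is not p4-fusion's actual implementation: `BranchSpec` and `resolveBranch` are hypothetical names, and the only assumption is that branches are tried in the order given and the first matching sub-path wins.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one '--branch depot/path:git-alias' argument.
struct BranchSpec
{
    std::string depotSubPath; // e.g. "tree/sub"
    std::string gitAlias; // e.g. "tree-sub"
};

// Returns {git alias, path under the branch} for the first branch whose
// sub-path is a directory prefix of relativePath, or {"", ""} if none match.
std::array<std::string, 2> resolveBranch(const std::vector<BranchSpec>& branches, const std::string& relativePath)
{
    for (const BranchSpec& b : branches)
    {
        // An empty sub-path would turn the prefix into a bare "/".
        assert(!b.depotSubPath.empty());
        const std::string prefix = b.depotSubPath + "/";
        if (relativePath.compare(0, prefix.size(), prefix) == 0)
        {
            return std::array<std::string, 2> { { b.gitAlias, relativePath.substr(prefix.size()) } };
        }
    }
    return std::array<std::string, 2> { { "", "" } };
}
```

With `tree/sub:tree-sub` listed before `tree`, the path `tree/sub/a.txt` resolves to the branch `tree-sub`. With the order reversed, the same path would be claimed by `tree` as `sub/a.txt`.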

## Notes on stream mapping mode

Stream mapping mode is disabled by default. When enabled, Perforce [stream views and paths](https://www.perforce.com/manuals/p4guide/Content/P4Guide/streams.paths.html) are treated as part of the depot into which the stream view is mapped. This can include completely out-of-tree depots with no shared path deeper than `//...`. Everything is committed to the master branch (using it with branching mode is untested).
Let's assume that you have some depots:
```
//A/foo/...
//B/bar/...
//C/baz/...
```
and you have a monorepo stream (read-only in Perforce) with the following mappings:
```
import a/... //A/foo/...
import b/... //B/bar/...
import c/... //C/baz/...
```

Stream mapping mode also recursively traverses those depots/streams and maps in their contents. Mapping in just a single file works as well.

If you ran p4-fusion without stream mapping mode, you could end up with an empty repo. With it enabled, paths are rewritten to merge all of those imports and treat them as one Git repo (almost) seamlessly.
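
The path rewriting can be sketched roughly as follows. This is a simplified illustration, not the code from this change: `ImportMapping` and `mapImportedFile` are hypothetical names, and it assumes each import is either a directory mapping where both sides end in `...` or an exact single-file mapping.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified form of one 'import <view path> <depot path>' line.
struct ImportMapping
{
    std::string viewPath; // e.g. "a/..."
    std::string depotPath; // e.g. "//A/foo/..."
};

static bool endsWith(const std::string& s, const std::string& suffix)
{
    return s.size() >= suffix.size() && s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Rewrites depotFile to its path inside the monorepo view.
// Returns true and fills outPath when some mapping applies.
bool mapImportedFile(const std::vector<ImportMapping>& mappings, const std::string& depotFile, std::string& outPath)
{
    // Perforce depot paths start with "//".
    assert(depotFile.compare(0, 2, "//") == 0);
    const std::string wildcard = "...";
    for (const ImportMapping& m : mappings)
    {
        if (endsWith(m.depotPath, wildcard) && endsWith(m.viewPath, wildcard))
        {
            // Directory import: swap the depot prefix for the view prefix.
            const std::string depotPrefix = m.depotPath.substr(0, m.depotPath.size() - wildcard.size());
            const std::string viewPrefix = m.viewPath.substr(0, m.viewPath.size() - wildcard.size());
            if (depotFile.compare(0, depotPrefix.size(), depotPrefix) == 0)
            {
                outPath = viewPrefix + depotFile.substr(depotPrefix.size());
                return true;
            }
        }
        else if (depotFile == m.depotPath)
        {
            // Single-file import: the view path is used as-is.
            outPath = m.viewPath;
            return true;
        }
    }
    return false;
}
```

Given the mappings above, `//A/foo/src/main.cc` would be rewritten to `a/src/main.cc`, while a file outside every import is reported as unmapped.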

A side effect of stream mapping mode is that you can use stream mappings to exclude paths that no longer exist, and as a result their contents are left out of the Git history.
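
A minimal sketch of such an exclusion check, applied to a path after it has been rewritten into the view. `isExcluded` is a hypothetical helper, and the patterns are assumed to be either a directory prefix ending in `...` or an exact file path.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns true when relativePath matches one of the exclusion patterns.
bool isExcluded(const std::vector<std::string>& exclusions, const std::string& relativePath)
{
    // Relative paths are expected to carry no leading '/'.
    assert(relativePath.empty() || relativePath[0] != '/');
    const std::string wildcard = "...";
    for (const std::string& pattern : exclusions)
    {
        const bool isDirPattern = pattern.size() >= wildcard.size()
            && pattern.compare(pattern.size() - wildcard.size(), wildcard.size(), wildcard) == 0;
        if (isDirPattern)
        {
            // Directory pattern: everything under the prefix is excluded.
            const std::string prefix = pattern.substr(0, pattern.size() - wildcard.size());
            if (relativePath.compare(0, prefix.size(), prefix) == 0)
            {
                return true;
            }
        }
        else if (pattern == relativePath)
        {
            // File pattern: only an exact match is excluded.
            return true;
        }
    }
    return false;
}
```

Files for which the check returns true would simply be skipped, so they never reach a Git commit.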

### Limitations

- If there is a cycle in the mapping graph, this mode will break.
- If the contents of a subdirectory were at one stage part of the parent stream and are now being mapped in (or vice versa), some files may be missing from the Git history. This is decided file by file: if several paths are mapped into one directory and the parent stream also has files in that directory, it works provided the file names are all unique to that stream.

## Checking Results

In order to test the validity of the logic, we need to run the program over a Perforce depot and compare each changelist against the corresponding Git commit SHA, to ensure the files match up.
@@ -106,12 +138,13 @@ Because of the extra effort the script performs, expect it to take orders of mag
## Build

0. Pre-requisites
* Install openssl@1.0.2t at `/usr/local/ssl` by following the steps [here](https://askubuntu.com/a/1094690).
* Install OpenSSL (both 1.1.1 and 3 work).
* Install CMake 3.16+.
* Install g++ 11.2.0 (older versions compatible with C++11 are also supported).
* Clone this repository or [get a release distribution](https://github.com/salesforce/p4-fusion/releases).
* Get the Helix Core C++ API binaries from the [official Perforce website](https://www.perforce.com/downloads/helix-core-c/c-api).
* Tested versions: 2021.1, 2021.2, 2022.1
* If you are using OpenSSL 3: as of October 12th 2024, you need to manually access the FTP site [here](https://cdist2.perforce.com/perforce/r24.1/bin.linux26x86_64/).
* Tested versions: 2021.1, 2021.2, 2022.1, 2024.1
* We recommend always picking the newest API versions that compile with p4-fusion.
* Extract the contents in `./vendor/helix-core-api/linux/` or `./vendor/helix-core-api/mac/` based on your OS.

59 changes: 56 additions & 3 deletions p4-fusion/branch_set.cc
@@ -106,8 +106,10 @@ std::vector<Branch> createBranchesFromPaths(const std::vector<std::string>& bran
return parsed;
}

BranchSet::BranchSet(std::vector<std::string>& clientViewMapping, const std::string& baseDepotPath, const std::vector<std::string>& branches, const bool includeBinaries)
BranchSet::BranchSet(std::vector<std::string>& clientViewMapping, const std::string& baseDepotPath, const std::vector<std::string>& branches, const std::vector<StreamResult::MappingData>& mappings, const std::vector<StreamResult::MappingData>& exclusions, const bool includeBinaries)
: m_branches(createBranchesFromPaths(branches))
, m_mappings(mappings)
, m_exclusions(exclusions)
, m_includeBinaries(includeBinaries)
{
m_view.InsertTranslationMapping(clientViewMapping);
@@ -219,11 +221,62 @@ std::unique_ptr<ChangedFileGroups> BranchSet::ParseAffectedFiles(const std::vect
{
continue;
}
// Logic for files that must never be mapped belongs here.
std::string relativeDepotPath = stripBasePath(depotFile);
if (relativeDepotPath.empty())
{
// Not under the depot path. Shouldn't happen due to the way we
// scan for files, but...
// Not under regular depot path, might be mapped in
// check and attenuate if it is.
bool isImport = false;
for (auto const& v : m_mappings)
{
if (STDHelpers::EndsWith(v.stream2, "..."))
{
// get rid of trailing ...
auto tempStr = v.stream2.substr(0, v.stream2.size() - 3);
if (STDHelpers::StartsWith(depotFile, tempStr))
{
isImport = true;
// Shove the replacement path at the front
relativeDepotPath = v.stream1.substr(0, v.stream1.size() - 3) + depotFile.substr(tempStr.size());
break;
}
if (depotFile == tempStr)
{
// Map in exactly one file and nothing else.
relativeDepotPath = v.stream1;
isImport = true;
break;
}
}
}

if (!isImport)
{
continue;
}
}

// Check that the file or path is not marked as excluded. Not all of a mapped-in directory may be wanted, so the check has to happen after mapping.
bool discard = false;
for (auto const& v : m_exclusions)
{
if (STDHelpers::EndsWith(v.stream1, "..."))
{
if (STDHelpers::StartsWith(relativeDepotPath, v.stream1.substr(0, v.stream1.size() - 3)))
{
discard = true;
break;
}
}
else if (v.stream1 == relativeDepotPath)
{
discard = true;
break;
}
}
if (discard)
{
continue;
}

199 changes: 101 additions & 98 deletions p4-fusion/branch_set.h
@@ -1,98 +1,101 @@
/*
* Copyright (c) 2022 Salesforce, Inc.
* All rights reserved.
* SPDX-License-Identifier: BSD-3-Clause
* For full license text, see the LICENSE.txt file in the repo root or https://opensource.org/licenses/BSD-3-Clause
*/
#pragma once

#include <string>
#include <vector>
#include <array>
#include <memory>
#include <stdexcept>

#include "commands/file_map.h"
#include "commands/file_data.h"
#include "utils/std_helpers.h"

struct BranchedFileGroup
{
// If a BranchedFiles collection hasSource == true,
// then all files in this collection MUST be a merge
// from the given source branch to the target branch.
// These branch names will be the Git branch names.
std::string sourceBranch;
std::string targetBranch;
bool hasSource;
std::vector<FileData> files;

// Get all the relative file names from each of the file data.
std::vector<std::string> GetRelativeFileNames();
};

struct ChangedFileGroups
{
private:
ChangedFileGroups();

public:
std::vector<BranchedFileGroup> branchedFileGroups;
int totalFileCount;

// When all the file groups have finished being used,
// only then can we safely clear out the data.
void Clear();

ChangedFileGroups(std::vector<BranchedFileGroup>& groups, int totalFileCount);

static std::unique_ptr<ChangedFileGroups> Empty() { return std::unique_ptr<ChangedFileGroups>(new ChangedFileGroups); };
};

struct Branch
{
public:
const std::string depotBranchPath;
const std::string gitAlias;

Branch(const std::string& branch, const std::string& alias);

// splitBranchPath If the relativeDepotPath matches, returns {branch alias, branch file path}.
// Otherwise, returns {"", ""}
std::array<std::string, 2> SplitBranchPath(const std::string& relativeDepotPath) const;
};

// A singular view on the branches and a base view (acts as a filter to trim down affected files).
// Maps a changed file state to a list of resulting branches and affected files.
struct BranchSet
{
private:
// Technically, these should all be const.
const bool m_includeBinaries;
std::string m_basePath;
const std::vector<Branch> m_branches;
FileMap m_view;

// stripBasePath remove the base path from the depot path, or "" if not in the base path.
std::string stripBasePath(const std::string& depotPath) const;

// splitBranchPath extract the branch name and path under the branch (no leading '/' on the path)
// relativeDepotPath - already stripped from running stripBasePath.
std::array<std::string, 2> splitBranchPath(const std::string& relativeDepotPath) const;

public:
BranchSet(std::vector<std::string>& clientViewMapping, const std::string& baseDepotPath, const std::vector<std::string>& branches, const bool includeBinaries);

// HasMergeableBranch is there a branch model that requires integration history?
bool HasMergeableBranch() const { return !m_branches.empty(); };

int Count() const { return m_branches.size(); };

// ParseAffectedFiles create collections of merges and commits.
// Breaks up the files into those that are within the view; each item in the
// list is its own target Git branch.
// This also has the side effect of populating the relative path value in the file data.
// ... the FileData object is copied, but its underlying data is shared. So, this
// breaks the const.
std::unique_ptr<ChangedFileGroups> ParseAffectedFiles(const std::vector<FileData>& cl) const;
};
/*
* Copyright (c) 2022 Salesforce, Inc.
* All rights reserved.
* SPDX-License-Identifier: BSD-3-Clause
* For full license text, see the LICENSE.txt file in the repo root or https://opensource.org/licenses/BSD-3-Clause
*/
#pragma once

#include <string>
#include <vector>
#include <array>
#include <memory>
#include <stdexcept>

#include "commands/file_map.h"
#include "commands/file_data.h"
#include "commands/stream_result.h"
#include "utils/std_helpers.h"

struct BranchedFileGroup
{
// If a BranchedFiles collection hasSource == true,
// then all files in this collection MUST be a merge
// from the given source branch to the target branch.
// These branch names will be the Git branch names.
std::string sourceBranch;
std::string targetBranch;
bool hasSource;
std::vector<FileData> files;

// Get all the relative file names from each of the file data.
std::vector<std::string> GetRelativeFileNames();
};

struct ChangedFileGroups
{
private:
ChangedFileGroups();

public:
std::vector<BranchedFileGroup> branchedFileGroups;
int totalFileCount;

// When all the file groups have finished being used,
// only then can we safely clear out the data.
void Clear();

ChangedFileGroups(std::vector<BranchedFileGroup>& groups, int totalFileCount);

static std::unique_ptr<ChangedFileGroups> Empty() { return std::unique_ptr<ChangedFileGroups>(new ChangedFileGroups); };
};

struct Branch
{
public:
const std::string depotBranchPath;
const std::string gitAlias;

Branch(const std::string& branch, const std::string& alias);

// splitBranchPath If the relativeDepotPath matches, returns {branch alias, branch file path}.
// Otherwise, returns {"", ""}
std::array<std::string, 2> SplitBranchPath(const std::string& relativeDepotPath) const;
};

// A singular view on the branches and a base view (acts as a filter to trim down affected files).
// Maps a changed file state to a list of resulting branches and affected files.
struct BranchSet
{
private:
// Technically, these should all be const.
const bool m_includeBinaries;
std::string m_basePath;
const std::vector<Branch> m_branches;
const std::vector<StreamResult::MappingData> m_mappings;
const std::vector<StreamResult::MappingData> m_exclusions;
FileMap m_view;

// stripBasePath remove the base path from the depot path, or "" if not in the base path.
std::string stripBasePath(const std::string& depotPath) const;

// splitBranchPath extract the branch name and path under the branch (no leading '/' on the path)
// relativeDepotPath - already stripped from running stripBasePath.
std::array<std::string, 2> splitBranchPath(const std::string& relativeDepotPath) const;

public:
BranchSet(std::vector<std::string>& clientViewMapping, const std::string& baseDepotPath, const std::vector<std::string>& branches, const std::vector<StreamResult::MappingData>& mappings, const std::vector<StreamResult::MappingData>& exclusions, const bool includeBinaries);

// HasMergeableBranch is there a branch model that requires integration history?
bool HasMergeableBranch() const { return !m_branches.empty(); };

int Count() const { return m_branches.size(); };

// ParseAffectedFiles create collections of merges and commits.
// Breaks up the files into those that are within the view; each item in the
// list is its own target Git branch.
// This also has the side effect of populating the relative path value in the file data.
// ... the FileData object is copied, but its underlying data is shared. So, this
// breaks the const.
std::unique_ptr<ChangedFileGroups> ParseAffectedFiles(const std::vector<FileData>& cl) const;
};