
Per framework caching #35

Open
kingfai opened this issue Feb 3, 2017 · 13 comments
Comments

@kingfai (Contributor) commented Feb 3, 2017

Hi @guidomb,
cc: @mokagio , @Dschee
I'd like to restart the discussion from issue #12.

My organization has multiple projects that share dependencies, including in-house ones. When we publish a new release of an in-house dependency, we would like to publish it to the cache for all other projects to use. We would need to enhance carthage_cache to also support publishing and downloading frameworks individually.

I'd like to propose that we follow option #1 as discussed here: #12 (comment)

As for the issue about not having a way to go from a built framework back to a Cartfile.resolved entry (#12 (comment)), I'm thinking we can include an optional repository mapping in .carthage_cache.yml. This is similar to how Rome solves this problem.

If we have the go ahead to implement this enhancement, I'll volunteer to do most of the implementation.

Your thoughts?

@guidomb commented Feb 3, 2017

How would the repository mapping look?

@kingfai (Contributor, Author) commented Feb 4, 2017

How about something like this?

---
:bucket_name: my-carthage-cache
:prune_on_publish: false
:aws_s3_client_options:
  :region: us-west-2
  :access_key_id: KEY123
  :secret_access_key: SECRET123
:repository_map:
  :Charts: git "git@github.com:danielgindi/ios-charts.git"
  :Realm: github "realm/realm-cocoa"
  :RealmSwift: github "realm/realm-cocoa"
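
As an illustration, loading and querying such a map takes only a few lines of Ruby. This is a hypothetical sketch, not carthage_cache code: it uses plain string keys for simplicity (the config above uses Ruby symbol keys), and `repository_for` is a made-up helper name.

```ruby
require "yaml"

# Hypothetical helper: look up the repository a built framework came from,
# using the proposed repository_map section of .carthage_cache.yml.
def repository_for(framework_name, config)
  (config["repository_map"] || {})[framework_name]
end

config = YAML.safe_load(<<~YAML)
  repository_map:
    Realm: github "realm/realm-cocoa"
    RealmSwift: github "realm/realm-cocoa"
YAML

puts repository_for("RealmSwift", config)
# => github "realm/realm-cocoa"
```

A framework with no entry simply maps to `nil`, which a caller could treat as "framework name equals repo name".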

@guidomb commented Feb 4, 2017

I think this feature should take into consideration two current features.

1 - Currently carthage_cache provides a mechanism to "white list" dependencies so they are not deleted by the prune command. This is usually needed when the name of the framework does not match the name of the git project. Check the last example in the usage section and this spec. The repository mapping could replace the white list file.

2 - carthage_cache has a lock file and a validate command that is usually pretty helpful when you pull the latest changes from the repo and want to make sure the right cache has been installed. Check this method. This feature should continue to work.

@Jeehut commented Feb 4, 2017

@kingfai I don't want to stop you from implementing this, but I should point out that there is work in progress in Carthage itself to only rebuild frameworks that have changed, which for most people should solve the problem in Carthage rather than in this tool. carthage_cache would then only be useful in testing environments (like CI) or for people joining a project with a big dependency setup.

The problem is that no one knows when that work will be finished, and it has been in progress for quite some time already. It does seem to be nearing completion in the last few weeks, though, so my best guess is that it will be merged in about two months or so.

Here's the link: Carthage/Carthage#1489

@kingfai (Contributor, Author) commented Feb 6, 2017

Hi @Dschee ,

Thanks for pointing out Carthage/Carthage#1489.

Avoiding unnecessary framework rebuilds will definitely help speed up developer workflows. For my organization, though, large dependency setups and the time it takes to copy down the dependencies for a clean build would still be an issue.

Nonetheless, I'm looking forward to that change and hope it comes in weeks instead of months!

@kingfai (Contributor, Author) commented Feb 8, 2017

Here's a proposal of the changes I'd like to implement:

  • Configuration
    • granular_caching will be OFF by default.
    • Turn on by
      • Adding a --granular_cache command line option to the publish and install commands, or
      • Adding granular_cache: true to .carthage_cache.yml
    • A new optional repository_map section in .carthage_cache.yml will map framework names to repo names:
:repository_map:
  :Charts: git "git@github.com:danielgindi/ios-charts.git"
  :Realm: github "realm/realm-cocoa"
  :RealmSwift: github "realm/realm-cocoa"

The repository_map could be used to replace the existing "white list" functionality.

  • carthage_cache publish (upload) behavior
    • Existing behavior of uploading zip file of all Cartfile.resolved dependencies is preserved.
    • If granular_caching is enabled, carthage_cache will also upload a zip file for each framework that is not already in the cache.
  • carthage_cache install (download) behavior
    • Existing behavior of attempting to download a zip file for Cartfile.resolved is preserved.
    • If granular_caching is on and the Cartfile.resolved zip file is not found, carthage_cache will attempt to download each framework from the cache. It will then call carthage bootstrap for any frameworks that are missing from the cache.
  • carthage_cache validate behavior
    • Should continue to work as it does today
  • carthage_cache list
    • New command
    • Lists frameworks in the cache and reports cache hits/misses according to the local Cartfile.resolved. Ignores dSYMs.
    • A --missing flag will list only missing frameworks
  • Format of data being stored on S3
    • Format for zip file of Cartfile.resolved files remains the same: ${s3_bucket}/${cartfile_resolved_hash}.zip. E.g.,
      • s3://carthage_cache/39143ebd4b14b7d6d894ee22af4456ee07f6e8af72415d4f07564339aee9fafd.zip
    • Format for storing individual frameworks will be: ${s3_bucket}/${repo_name}/${framework_name}-${framework_version}.zip. Example:
carthage_cache/
├── 39143ebd4b14b7d6d894ee22af4456ee07f6e8af72415d4f07564339aee9fafd.zip
└── Build
    ├── Mac
    │   ├── Charts
    │   │   ├── Charts-2.3.1.zip
    │   │   └── Charts-3.0.1.zip
    │   └── realm-cocoa
    │       ├── Realm-2.4.1.zip
    │       ├── Realm-2.4.2.zip
    │       ├── RealmSwift-2.4.1.zip
    │       └── RealmSwift-2.4.2.zip
    ├── iOS
    │   ├── Charts
    │   │   ├── Charts-2.3.1.zip
    │   │   └── Charts-3.0.1.zip
    │   └── realm-cocoa
    │       ├── Realm-2.4.1.zip
    │       ├── Realm-2.4.2.zip
    │       ├── RealmSwift-2.4.1.zip
    │       └── RealmSwift-2.4.2.zip
    └── tvOS
        ├── Charts
        │   ├── Charts-2.3.1.zip
        │   └── Charts-3.0.1.zip
        └── realm-cocoa
            ├── Realm-2.4.1.zip
            ├── Realm-2.4.2.zip
            ├── RealmSwift-2.4.1.zip
            └── RealmSwift-2.4.2.zip
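
The two key formats could be sketched side by side like this (a hypothetical sketch: the method names are mine, the digest is assumed to be SHA-256 based on the 64-hex-character example above, and the granular key follows the tree layout, which nests zips under Build/<platform>/):

```ruby
require "digest"

# Existing format: one archive named after a SHA-256 digest of
# the Cartfile.resolved contents.
def archive_key(cartfile_resolved_contents)
  "#{Digest::SHA256.hexdigest(cartfile_resolved_contents)}.zip"
end

# Proposed granular format, following the tree above:
# Build/<platform>/<repo_name>/<framework_name>-<framework_version>.zip
def framework_key(platform, repo_name, framework_name, version)
  "Build/#{platform}/#{repo_name}/#{framework_name}-#{version}.zip"
end

puts framework_key("iOS", "realm-cocoa", "RealmSwift", "2.4.2")
# => Build/iOS/realm-cocoa/RealmSwift-2.4.2.zip
```

Keying granular zips by repo, framework, and version (rather than by a hash of the whole resolved file) is what lets multiple projects share individual builds.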

Let me know what you think.

@guidomb commented Feb 8, 2017

@kingfai Awesome! Thanks for such detailed description. My comments

  • Could you explain how the prune command would work if granular_caching is disabled?

  • I am not quite sure if uploading both full archive and individual builds when granular_caching is enabled is a good option. Why do you think this is necessary? My main concern here is regarding bandwidth and upload time.

  • If granular_caching is on and the Cartfile.resolved zip file is not found, carthage_cache will attempt to download each framework from the cache. It will then call carthage bootstrap for any frameworks that are missing from the cache.

I don't understand in which case Cartfile.resolved zip won't be found. Because as far as I understood you would always upload the full archive.

  • How would the validate command work if granular_caching is enabled?

I am a little concerned about the complexity introduced by having to maintain two modes, but I cannot think of a better alternative. Another option would be to introduce logic in the backend and generate an archive on demand.

For example, a project publishes its full cache archive to a backend service that is responsible for unarchiving the zip file and storing each precompiled framework indexed by its resolved version and the version of the compiler that was used. Then, when another project wants to use Carthage cache, it could send its Cartfile.resolved plus the compiler version, and the backend could generate a cache archive on the fly by picking each precompiled framework and assembling a full archive for the given Cartfile.resolved / compiler version pair. Such a backend could also cache this assembled archive and return it when the same archive is requested by another client. The backend could also tell the client when it does not have all the frameworks, and the client could decide to download a partial cache, compile the missing frameworks, and then upload the newly compiled frameworks to the backend. But this approach adds a lot of complexity too.
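
The core of that backend idea is an index of precompiled frameworks keyed by name, resolved version, and compiler version. An entirely hypothetical sketch (class and method names are mine):

```ruby
# Hypothetical backend index: precompiled framework blobs are stored
# keyed by [name, version, compiler], and an archive for a given
# Cartfile.resolved is assembled from lookups against that index.
class FrameworkIndex
  def initialize
    @store = {}
  end

  def publish(name, version, compiler, blob)
    @store[[name, version, compiler]] = blob
  end

  # Returns [found_blobs, missing_entries] so the client can decide to
  # download a partial cache and build/upload what is missing.
  def assemble(resolved_entries, compiler)
    found, missing = resolved_entries.partition { |n, v| @store.key?([n, v, compiler]) }
    [found.map { |n, v| @store[[n, v, compiler]] }, missing]
  end
end

index = FrameworkIndex.new
index.publish("Realm", "2.4.2", "swift-3.0", "realm-blob")
blobs, missing = index.assemble([["Realm", "2.4.2"], ["Charts", "3.0.1"]], "swift-3.0")
# blobs   => ["realm-blob"]
# missing => [["Charts", "3.0.1"]]
```

Keying on the compiler version matters because Swift frameworks built by one compiler generally cannot be linked by another.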

Or maybe we could have a combination of both options, where the client only uploads the full archive and a Lambda function unzips the archive and saves the individual frameworks. If a client then uses granular_caching, the behavior you described remains valid.

@guidomb commented Feb 8, 2017

Also, how would such a repository map be generated? Would carthage_cache generate it automatically?

@kingfai (Contributor, Author) commented Feb 8, 2017

No problem @guidomb!

Here are some of my thoughts:

Could you explain how the prune command would work if granular_caching is disabled?
I took a look at the code for the prune command and played with it. It seems to delete any framework that is not listed in the Cartfile.resolved file or the whitelist file. This deleted some of my frameworks where the framework name did not match the repo name.

I'm thinking the prune command should just use the repository_map as the whitelist, regardless of whether granular_caching is enabled. Is there an edge case where someone would not want to list something in the repository_map but would want it whitelisted?
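
To illustrate the idea (a hypothetical sketch, not the gem's actual prune code): a built framework would survive pruning if it is named in Cartfile.resolved or appears as a key of repository_map.

```ruby
# Hypothetical prune filter: keep frameworks named in Cartfile.resolved
# or listed as keys of the repository_map; everything else would be
# deleted from the Build directory by the real command.
def frameworks_to_keep(built_frameworks, resolved_names, repository_map)
  keep = resolved_names + repository_map.keys
  built_frameworks.select { |name| keep.include?(name) }
end

map = {
  "Realm"      => 'github "realm/realm-cocoa"',
  "RealmSwift" => 'github "realm/realm-cocoa"',
}

# Realm/RealmSwift don't match the repo name "realm-cocoa", so without
# the map they would be pruned; "Stale" is kept by neither source.
frameworks_to_keep(%w[Realm RealmSwift Stale], ["realm-cocoa"], map)
# => ["Realm", "RealmSwift"]
```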

I am not quite sure if uploading both full archive and individual builds when granular_caching is enabled is a good option. Why do you think this is necessary? ...

I don't understand in which case Cartfile.resolved zip won't be found. Because as far as I understood you would always upload the full archive. ...

Those are good points. I think I was following the initial idea from #12 (comment). Perhaps it would be cleaner if we just have the two modes you mentioned: if one uses granular_cache, then we do not upload the Cartfile.resolved zip. That takes care of the bandwidth concern and simplifies the logic when downloading.

How would the validate command work if granular_caching is enabled?

The validate command could compare the contents of the Build directory with what is in Cartfile.resolved + repository_map and fail if something is missing in Build.
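
That comparison could be sketched like this (hypothetical; the method name is mine, and the real command works on directory contents rather than name arrays):

```ruby
# Hypothetical validate check under granular caching: every framework
# expected from Cartfile.resolved plus the repository_map must be
# present in the Build directory.
def validate_build(build_dir_frameworks, expected_frameworks)
  missing = expected_frameworks - build_dir_frameworks
  missing.empty? ? :ok : missing
end

validate_build(%w[Realm RealmSwift], %w[Realm RealmSwift Charts])
# => ["Charts"], i.e. the cache install is incomplete
```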

Also how would generate such repository map? Would carthage cache generate automatically?

I think that initially a user would have to use the list command to figure out which frameworks are in the Build directory but missing from the cache. She would then have to work out the mapping and add it to the repository_map. Generating the repository_map automatically would be an additional sizeable enhancement: we would probably have to crawl the Carthage/Checkouts dir, extract the framework name from each Xcode project, and assemble a repo map.
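
A rough sketch of that crawl (hypothetical: the real enhancement would have to read each Xcode project to get the true framework product names, whereas here the .xcodeproj file name stands in for the framework name):

```ruby
require "pathname"

# Hypothetical crawler: derive a framework -> repo map by scanning
# Carthage/Checkouts/<repo>/**/*.xcodeproj. The project file name is
# used as an approximation of the framework name.
def crawl_repository_map(checkouts_dir)
  map = {}
  Pathname(checkouts_dir).children.select(&:directory?).each do |repo|
    repo.glob("**/*.xcodeproj") do |project|
      map[project.basename(".xcodeproj").to_s] = repo.basename.to_s
    end
  end
  map
end
```

Repos whose project names differ from their framework products (the exact cases the repository_map exists for) would still need manual entries.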

Your thoughts?

@kingfai (Contributor, Author) commented Feb 15, 2017

Hi @guidomb ,

Do you have any preferences on how I submit the PR(s) for this feature? It's going to be pretty large. If it makes sense to you, I could submit the changes in chunks like so:

  • granular_cache configuration
  • uploading
  • downloading

@guidomb commented Feb 15, 2017 via email

@guidomb commented Feb 27, 2017

Sorry for disappearing. I wanted to let you know that Carthage has released version 0.20.0, which adds a --cache-builds flag. This solves most of the problem that carthage_cache, and this particular issue, work around. I think carthage_cache still makes sense for a "cold" bootstrap or if you use a service like TravisCI for continuous integration. That is why I think it is better not to add complexity to carthage_cache. Ideally Carthage will improve, framework authors will upload precompiled binaries to GitHub, and CarthageCache will have no reason to exist.

@kingfai Thanks a lot for trying to tackle this problem. I'm sorry for the time you have invested, but I strongly believe this is the best option moving forward. I'll keep this issue open for a while in case I'm missing something.

@kingfai (Contributor, Author) commented Mar 29, 2017

Just a heads-up that I'm pretty much done implementing this feature in-house.

If things change, and we end up wanting to look at the changes, I'd be happy to share it with everyone on my fork.
