Using the python-gitlab package, I'd like to extract lines from all Dockerfiles. With my code below, I can get the project names and repository URLs I want, but how can I check that a Dockerfile exists and read its contents?
import gitlab
import json
from pprint import pprint
import requests
import urllib.request

# private token authentication
gl = gitlab.Gitlab('<path_to_gitlab_repo>', private_token=<token_here>)
gl.auth()

# list all projects
projects = gl.projects.list()
for project in projects:
    # print(project)  # prints all the meta data for the project
    print("Project: ", project.name)
    print("Gitlab URL: ", project.http_url_to_repo)
    # print("Branches: ", project.repo_branches)
    pprint(project.repository_tree(all=True))
    f = urllib.request.urlopen(project.http_url_to_repo)
    myfile = f.read()
    print(myfile)
    print("\n\n")
The output I get now is:
Gitlab URL: <path_to_gitlab_repo>
[{'id': '0c4a64925f5c129d33557', 'mode': '1044', 'name': 'README.md', 'path': 'README.md', 'type': 'blob'}]
Answer
You can use the project.files.get() method (see the python-gitlab documentation) to get the Dockerfile of a project.
You can then print the content of the Dockerfile, or do whatever else you want with it, like this:
import base64

import gitlab

# private token authentication
gl = gitlab.Gitlab('<gitlab-url>', private_token='<private-token>')
gl.auth()

# list all projects
projects = gl.projects.list(all=True)
for project in projects:
    # print(project)  # prints all the meta data for the project
    # print("Project: ", project.name)
    # print("Gitlab URL: ", project.http_url_to_repo)

    # Skip projects without branches
    branches = project.branches.list()
    if len(branches) == 0:
        continue
    branch = branches[0].name

    try:
        f = project.files.get(file_path='Dockerfile', ref=branch)
    except gitlab.exceptions.GitlabGetError:
        # Skip projects without Dockerfile
        continue

    # The API returns the file content base64-encoded
    file_content = base64.b64decode(f.content).decode("utf-8")
    print(file_content)
This example uses the first branch the API returns. If a project has multiple branches, you might have to adjust the branch name, for example by using project.default_branch.
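If a Dockerfile might live in a subdirectory rather than at the repository root, one option is to filter the repository tree listing before fetching any file. The sketch below is only an illustration under assumptions: the helper names find_dockerfiles and decode_lines are hypothetical, and sample_tree is made-up data shaped like the repository_tree() output shown in the question. It runs without a GitLab connection.

```python
import base64


def find_dockerfiles(tree):
    """Return the paths of all blobs named 'Dockerfile' in a tree listing."""
    return [
        entry["path"]
        for entry in tree
        if entry["type"] == "blob" and entry["name"] == "Dockerfile"
    ]


def decode_lines(encoded_content):
    """Decode base64-encoded file content into a list of text lines."""
    return base64.b64decode(encoded_content).decode("utf-8").splitlines()


# Hypothetical tree listing, shaped like the output shown in the question.
sample_tree = [
    {"id": "a1", "mode": "100644", "name": "README.md",
     "path": "README.md", "type": "blob"},
    {"id": "b2", "mode": "040000", "name": "docker",
     "path": "docker", "type": "tree"},
    {"id": "c3", "mode": "100644", "name": "Dockerfile",
     "path": "docker/Dockerfile", "type": "blob"},
]

# Hypothetical file content, base64-encoded the way the files API returns it.
sample_content = base64.b64encode(
    b"FROM python:3.11\nRUN pip install python-gitlab\n"
)

print(find_dockerfiles(sample_tree))
print(decode_lines(sample_content))
```

Against a real project, the tree would presumably come from project.repository_tree(recursive=True, all=True), and each returned path could be passed as file_path to project.files.get(), with f.content handed to decode_lines. Depending on the python-gitlab version, the pagination argument may be spelled differently.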