本站源代码
您最多选择25个主题 主题必须以字母或数字开头,可以包含连字符 (-),并且长度不得超过35个字符

303 行
8.1KB

  1. package git
  2. import (
  3. "bytes"
  4. "errors"
  5. "fmt"
  6. "strconv"
  7. "strings"
  8. "time"
  9. "unicode/utf8"
  10. "gopkg.in/src-d/go-git.v4/plumbing"
  11. "gopkg.in/src-d/go-git.v4/plumbing/object"
  12. "gopkg.in/src-d/go-git.v4/utils/diff"
  13. )
  14. // BlameResult represents the result of a Blame operation.
  15. type BlameResult struct {
  16. // Path is the path of the File that we're blaming.
  17. Path string
  18. // Rev (Revision) is the hash of the specified Commit used to generate this result.
  19. Rev plumbing.Hash
  20. // Lines contains every line with its authorship.
  21. Lines []*Line
  22. }
  23. // Blame returns a BlameResult with the information about the last author of
  24. // each line from file `path` at commit `c`.
  25. func Blame(c *object.Commit, path string) (*BlameResult, error) {
  26. // The file to blame is identified by the input arguments:
  27. // commit and path. commit is a Commit object obtained from a Repository. Path
  28. // represents a path to a specific file contained into the repository.
  29. //
  30. // Blaming a file is a two step process:
  31. //
  32. // 1. Create a linear history of the commits affecting a file. We use
  33. // revlist.New for that.
  34. //
  35. // 2. Then build a graph with a node for every line in every file in
  36. // the history of the file.
  37. //
  38. // Each node is assigned a commit: Start by the nodes in the first
  39. // commit. Assign that commit as the creator of all its lines.
  40. //
  41. // Then jump to the nodes in the next commit, and calculate the diff
  42. // between the two files. Newly created lines get
  43. // assigned the new commit as its origin. Modified lines also get
  44. // this new commit. Untouched lines retain the old commit.
  45. //
  46. // All this work is done in the assignOrigin function which holds all
  47. // the internal relevant data in a "blame" struct, that is not
  48. // exported.
  49. //
  50. // TODO: ways to improve the efficiency of this function:
  51. // 1. Improve revlist
  52. // 2. Improve how to traverse the history (example a backward traversal will
  53. // be much more efficient)
  54. //
  55. // TODO: ways to improve the function in general:
  56. // 1. Add memoization between revlist and assign.
  57. // 2. It is using much more memory than needed, see the TODOs below.
  58. b := new(blame)
  59. b.fRev = c
  60. b.path = path
  61. // get all the file revisions
  62. if err := b.fillRevs(); err != nil {
  63. return nil, err
  64. }
  65. // calculate the line tracking graph and fill in
  66. // file contents in data.
  67. if err := b.fillGraphAndData(); err != nil {
  68. return nil, err
  69. }
  70. file, err := b.fRev.File(b.path)
  71. if err != nil {
  72. return nil, err
  73. }
  74. finalLines, err := file.Lines()
  75. if err != nil {
  76. return nil, err
  77. }
  78. // Each node (line) holds the commit where it was introduced or
  79. // last modified. To achieve that we use the FORWARD algorithm
  80. // described in Zimmermann, et al. "Mining Version Archives for
  81. // Co-changed Lines", in proceedings of the Mining Software
  82. // Repositories workshop, Shanghai, May 22-23, 2006.
  83. lines, err := newLines(finalLines, b.sliceGraph(len(b.graph)-1))
  84. if err != nil {
  85. return nil, err
  86. }
  87. return &BlameResult{
  88. Path: path,
  89. Rev: c.Hash,
  90. Lines: lines,
  91. }, nil
  92. }
  93. // Line values represent the contents and author of a line in BlamedResult values.
  94. type Line struct {
  95. // Author is the email address of the last author that modified the line.
  96. Author string
  97. // Text is the original text of the line.
  98. Text string
  99. // Date is when the original text of the line was introduced
  100. Date time.Time
  101. // Hash is the commit hash that introduced the original line
  102. Hash plumbing.Hash
  103. }
  104. func newLine(author, text string, date time.Time, hash plumbing.Hash) *Line {
  105. return &Line{
  106. Author: author,
  107. Text: text,
  108. Hash: hash,
  109. Date: date,
  110. }
  111. }
  112. func newLines(contents []string, commits []*object.Commit) ([]*Line, error) {
  113. lcontents := len(contents)
  114. lcommits := len(commits)
  115. if lcontents != lcommits {
  116. if lcontents == lcommits-1 && contents[lcontents-1] != "\n" {
  117. contents = append(contents, "\n")
  118. } else {
  119. return nil, errors.New("contents and commits have different length")
  120. }
  121. }
  122. result := make([]*Line, 0, lcontents)
  123. for i := range contents {
  124. result = append(result, newLine(
  125. commits[i].Author.Email, contents[i],
  126. commits[i].Author.When, commits[i].Hash,
  127. ))
  128. }
  129. return result, nil
  130. }
  131. // this struct is internally used by the blame function to hold its
  132. // inputs, outputs and state.
  133. type blame struct {
  134. // the path of the file to blame
  135. path string
  136. // the commit of the final revision of the file to blame
  137. fRev *object.Commit
  138. // the chain of revisions affecting the the file to blame
  139. revs []*object.Commit
  140. // the contents of the file across all its revisions
  141. data []string
  142. // the graph of the lines in the file across all the revisions
  143. graph [][]*object.Commit
  144. }
  145. // calculate the history of a file "path", starting from commit "from", sorted by commit date.
  146. func (b *blame) fillRevs() error {
  147. var err error
  148. b.revs, err = references(b.fRev, b.path)
  149. return err
  150. }
  151. // build graph of a file from its revision history
  152. func (b *blame) fillGraphAndData() error {
  153. //TODO: not all commits are needed, only the current rev and the prev
  154. b.graph = make([][]*object.Commit, len(b.revs))
  155. b.data = make([]string, len(b.revs)) // file contents in all the revisions
  156. // for every revision of the file, starting with the first
  157. // one...
  158. for i, rev := range b.revs {
  159. // get the contents of the file
  160. file, err := rev.File(b.path)
  161. if err != nil {
  162. return nil
  163. }
  164. b.data[i], err = file.Contents()
  165. if err != nil {
  166. return err
  167. }
  168. nLines := countLines(b.data[i])
  169. // create a node for each line
  170. b.graph[i] = make([]*object.Commit, nLines)
  171. // assign a commit to each node
  172. // if this is the first revision, then the node is assigned to
  173. // this first commit.
  174. if i == 0 {
  175. for j := 0; j < nLines; j++ {
  176. b.graph[i][j] = b.revs[i]
  177. }
  178. } else {
  179. // if this is not the first commit, then assign to the old
  180. // commit or to the new one, depending on what the diff
  181. // says.
  182. b.assignOrigin(i, i-1)
  183. }
  184. }
  185. return nil
  186. }
  187. // sliceGraph returns a slice of commits (one per line) for a particular
  188. // revision of a file (0=first revision).
  189. func (b *blame) sliceGraph(i int) []*object.Commit {
  190. fVs := b.graph[i]
  191. result := make([]*object.Commit, 0, len(fVs))
  192. for _, v := range fVs {
  193. c := *v
  194. result = append(result, &c)
  195. }
  196. return result
  197. }
  198. // Assigns origin to vertexes in current (c) rev from data in its previous (p)
  199. // revision
  200. func (b *blame) assignOrigin(c, p int) {
  201. // assign origin based on diff info
  202. hunks := diff.Do(b.data[p], b.data[c])
  203. sl := -1 // source line
  204. dl := -1 // destination line
  205. for h := range hunks {
  206. hLines := countLines(hunks[h].Text)
  207. for hl := 0; hl < hLines; hl++ {
  208. switch {
  209. case hunks[h].Type == 0:
  210. sl++
  211. dl++
  212. b.graph[c][dl] = b.graph[p][sl]
  213. case hunks[h].Type == 1:
  214. dl++
  215. b.graph[c][dl] = b.revs[c]
  216. case hunks[h].Type == -1:
  217. sl++
  218. default:
  219. panic("unreachable")
  220. }
  221. }
  222. }
  223. }
  224. // GoString prints the results of a Blame using git-blame's style.
  225. func (b *blame) GoString() string {
  226. var buf bytes.Buffer
  227. file, err := b.fRev.File(b.path)
  228. if err != nil {
  229. panic("PrettyPrint: internal error in repo.Data")
  230. }
  231. contents, err := file.Contents()
  232. if err != nil {
  233. panic("PrettyPrint: internal error in repo.Data")
  234. }
  235. lines := strings.Split(contents, "\n")
  236. // max line number length
  237. mlnl := len(strconv.Itoa(len(lines)))
  238. // max author length
  239. mal := b.maxAuthorLength()
  240. format := fmt.Sprintf("%%s (%%-%ds %%%dd) %%s\n",
  241. mal, mlnl)
  242. fVs := b.graph[len(b.graph)-1]
  243. for ln, v := range fVs {
  244. fmt.Fprintf(&buf, format, v.Hash.String()[:8],
  245. prettyPrintAuthor(fVs[ln]), ln+1, lines[ln])
  246. }
  247. return buf.String()
  248. }
  249. // utility function to pretty print the author.
  250. func prettyPrintAuthor(c *object.Commit) string {
  251. return fmt.Sprintf("%s %s", c.Author.Name, c.Author.When.Format("2006-01-02"))
  252. }
  253. // utility function to calculate the number of runes needed
  254. // to print the longest author name in the blame of a file.
  255. func (b *blame) maxAuthorLength() int {
  256. memo := make(map[plumbing.Hash]struct{}, len(b.graph)-1)
  257. fVs := b.graph[len(b.graph)-1]
  258. m := 0
  259. for ln := range fVs {
  260. if _, ok := memo[fVs[ln].Hash]; ok {
  261. continue
  262. }
  263. memo[fVs[ln].Hash] = struct{}{}
  264. m = max(m, utf8.RuneCountInString(prettyPrintAuthor(fVs[ln])))
  265. }
  266. return m
  267. }
  268. func max(a, b int) int {
  269. if a > b {
  270. return a
  271. }
  272. return b
  273. }
上海开阖软件有限公司 沪ICP备12045867号-1