update implementation for 2020 spec changes

Mergeddmitshur opened this change 5 months ago
dmitshur commented 5 months ago

With the new specification, it's possible to stop reading a file early even when the generated comment is absent, since the generated comment may no longer appear anywhere in the file. Begin parsing /* */ comment blocks, which may span multiple lines, to be able to be able to detect when we run into the first non-comment, non-blank text in the file.

It was mentioned in

[...] the primary focus for these comments is Go code and more generally Go package sources.
Go package sources today can be Go, assembly, C, C++, Objective C, SWIG, and Fortran.
The comment being discussed here, like the //go:build comment, handles all of these these except Fortran, which seems fine.

The initial scope of this package was to support parsing Go source only, but it happens to work well with many other source types that are found in Go packages. Replace mentions of "a Go source file" in documentation and code with "a source file", to make it apply more generally. If this will mean we need to parse comments differently depending on the source type in the future, we can revisit this decision. Let's see how it goes.

Fixes issue 1.

Write Preview Markdown
dmitshur requested a review from dominikh 1 month ago
dmitshur uploaded Patch Set 2 1 month ago
dmitshur commented 1 month ago
generated.go:4 (Patch Set 1) · 1 month ago

Shortened '' to '' here, and on line 24.

Write Preview Markdown
dominikh commented 1 month ago


Commit Message:7 (Patch Set 1) · 1 month ago

Should capitalize sentence :P

Write Preview Markdown
generated.go:59 (Patch Set 1) · 1 month ago

given at the start of the line and updated at the end of the line

I know what you mean, but I find the wording confusing. To be honest I'd probably not use a pointer and instead return the new state of inBlock.

Write Preview Markdown
dmitshur uploaded Patch Set 3 1 month ago
dmitshur commented 1 month ago
Commit Message:7 (Patch Set 1) · 1 month ago

I'm following the Go style for commit messages, which chooses to do:

A rule of thumb is that it should be written so to complete the sentence "This change modifies Go to _____." That means it does not start with a capital letter, is not a complete sentence, and actually summarizes the result of the change.

(From and emphasis mine.)

Write Preview Markdown
fuzz_test.go:2 (Patch Set 1) · 1 month ago

Go fuzzing support is out of beta and no longer uses this build constraint. It should be replaced with go1.18 by now.

Done in PS 3.

Write Preview Markdown
generated.go:59 (Patch Set 1) · 1 month ago · edited

I replaced the confusing description here with a better comment for var inBlock bool in Parse.

When writing this code originally, I started with returning the new inBlock value. It got awkward because containsNonComment already returns a bool, so it started needing named result parameters, but inBlock couldn't be reused, and another name becomes confusing. Similarly, dealing with two results in the caller made it clunky.

I also explored making containsNonComment a method, so it could just update its field, but that added more verbosity compared to just making this one bool a pointer. So, I don't like using a pointer either, but not using it didn't end up working out better.

Ultimately, there's not much point for Parse to break input into lines via ReadLines('\n') and process them separately. By now it makes more sense to just do everything in one pass, reading byte-by-byte, while keeping track of /* */ and // style comments, until the first non-comment byte. When that optimization happens, the pointer will go away.
Write Preview Markdown
dominikh reviewed +1 1 month ago
dmitshur merged commit c5b6cf572ec544b54bfbb13fb97e57907e0aec5b into master 3 weeks ago
Write Preview Markdown