Fix bugs with unicode filenames in runfiles.
We had a cc_test using a bunch of files (`data = glob(["data/**"]),`), some which were in subdirectories where the directory name had unicode characters (e.g. `data/test_öΩ/`).
This resulted in an error:
```
ERROR: C:/users/.../BUILD.bazel:233:8: Creating runfiles tree bazel-out/x64_windows-opt/test-shared.exe.runfiles failed: build-runfiles.exe failed: error executing command
cd /d C:\users\...
SET BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN=1
SET PATH=...
C:\users\...\install\eadbc20bd36081cbe58dcb5129383507\build-runfiles.exe bazel-out/x64_windows-opt/test-shared.exe.runfiles_manifest bazel-out/x64_windows-opt/test-shared.exe.runfiles: Process exited with status 1: Process exited with status 1
build-runfiles error: MakeDirectoriesW failed (\\?\c:\users\...\test-shared.exe.runfiles\test\data\layer_middle_??): (error: 123): The filename, directory name, or volume label syntax is incorrect.
```
`build-runfiles-windows.cc` expects the `.runfiles_manifest` to be encoded in UTF-8; but Java was writing it as latin-1. This resulted in unicode characters (not representable in latin-1) being stored as `?` in the `.runfiles_manifest` file. Question marks are not valid in filenames, causing the error in `build-runfiles`.
This commit changes the encoding of `.runfiles_manifest` from latin-1 to UTF-8. Additionally, it fixes a bug in `build-runfiles-windows.cc` where the `space_pos` was calculated from the UTF-8 representation but then used for the UTF-16 wstring, which caused string containing unicode characters to be split incorrectly.
Closes #15846.
PiperOrigin-RevId: 509492501
Change-Id: I82d5119160c16d4513e9e744d9f398727bac1dc4
diff --git a/src/main/java/com/google/devtools/build/lib/analysis/SourceManifestAction.java b/src/main/java/com/google/devtools/build/lib/analysis/SourceManifestAction.java
index 6feb2b9..e91419e 100644
--- a/src/main/java/com/google/devtools/build/lib/analysis/SourceManifestAction.java
+++ b/src/main/java/com/google/devtools/build/lib/analysis/SourceManifestAction.java
@@ -14,7 +14,6 @@
package com.google.devtools.build.lib.analysis;
import static com.google.common.collect.ImmutableList.toImmutableList;
-import static java.nio.charset.StandardCharsets.ISO_8859_1;
import static java.nio.charset.StandardCharsets.UTF_8;
import com.google.common.annotations.VisibleForTesting;
@@ -225,7 +224,7 @@
* @throws IOException
*/
private void writeFile(OutputStream out, Map<PathFragment, Artifact> output) throws IOException {
- Writer manifestFile = new BufferedWriter(new OutputStreamWriter(out, ISO_8859_1));
+ Writer manifestFile = new BufferedWriter(new OutputStreamWriter(out, UTF_8));
List<Map.Entry<PathFragment, Artifact>> sortedManifest = new ArrayList<>(output.entrySet());
sortedManifest.sort(ENTRY_COMPARATOR);
for (Map.Entry<PathFragment, Artifact> line : sortedManifest) {
diff --git a/src/main/tools/build-runfiles-windows.cc b/src/main/tools/build-runfiles-windows.cc
index 0a0b0ab..da39672 100644
--- a/src/main/tools/build-runfiles-windows.cc
+++ b/src/main/tools/build-runfiles-windows.cc
@@ -164,8 +164,8 @@
continue;
}
- size_t space_pos = line.find_first_of(' ');
wstring wline = blaze_util::CstringToWstring(line);
+ size_t space_pos = wline.find_first_of(' ');
wstring link, target;
if (space_pos == string::npos) {
link = wline;