Skip to content

feat: windows support [WIP]#216

Draft
JalinWang wants to merge 49 commits intomainfrom
feat/win-compilation
Draft

feat: windows support [WIP]#216
JalinWang wants to merge 49 commits intomainfrom
feat/win-compilation

Conversation

@JalinWang
Copy link
Collaborator

@JalinWang JalinWang commented Mar 12, 2026

Greptile Summary

This WIP PR adds initial Windows (MSVC) support across the zvec codebase, touching 76 files spanning the build system, platform-abstraction headers, core algorithm implementations, storage helpers, and tests.

Key changes include:

  • Build system: MSVC detection in the root CMakeLists.txt selects C++20 (required for designated-initializer support in MSVC), adds the necessary /FS, /EHsc, and /Zc:preprocessor flags, and re-routes all third-party builds (lz4, Arrow, RocksDB, glog, sparsehash) to their Windows-compatible build paths.
  • Platform portability: GCC-only constructs are systematically replaced — __builtin_bswap*_byteswap_*, posix_memalignailego_aligned_malloc, zero-length arrays [0] → C99 flexible array members [], __attribute__((format(printf,...))) → the new AILEGO_PRINTF_FORMAT macro, and POSIX headers guarded with #ifndef _MSC_VER.
  • Arrow CMakeLists regression: The non-Android CONFIGURE_COMMAND accidentally drops env ${CONFIGURE_ENV_LIST}, breaking USE_OSS_MIRROR mirror configuration for Arrow builds on all non-Android platforms.
  • Broad warning suppression: /wd4267 and /wd4244 (narrowing conversions) are globally disabled with a TODO comment acknowledging they need to be resolved; these warnings catch real truncation bugs and should be addressed at the source rather than silenced.
  • Duplicate type alias: using key_t = uint64_t is introduced in both ivf_index_format.h and sparse_utility.h within namespace zvec::core; it should live in a single shared header.
  • Duplicate warning suppression: /wd4334 is listed twice in CMakeLists.txt (once in a combined line, once standalone).

Confidence Score: 3/5

  • This WIP PR is not ready to merge: it contains a functional regression in the Arrow OSS-mirror build path and globally suppresses narrowing-conversion warnings that could hide real bugs.
  • The majority of platform-portability changes (intrinsic swaps, header guards, flexible array members, aligned-malloc wrappers, macro portability) are mechanically correct and well-scoped. However, the accidental removal of env ${CONFIGURE_ENV_LIST} from Arrow's configure command is a real regression for USE_OSS_MIRROR builds, the project-wide suppression of C4267/C4244 masks potentially real type-truncation issues with only a TODO comment, and the PR is explicitly marked [WIP]. The overall approach and direction are sound, but these issues should be resolved before merging.
  • thirdparty/arrow/CMakeLists.txt (OSS mirror regression) and CMakeLists.txt (broad warning suppression) need the most attention.

Important Files Changed

Filename Overview
CMakeLists.txt Adds MSVC/C++20 detection, Windows compile flags, and Arrow static-lib defines; contains duplicate /wd4334 suppression and broadly suppresses C4267/C4244 narrowing warnings with a TODO, which could hide real conversion bugs.
thirdparty/arrow/CMakeLists.txt Adds Windows .lib names and multi-config build support, but accidentally drops the env ${CONFIGURE_ENV_LIST} prefix from the non-Android CONFIGURE_COMMAND, breaking USE_OSS_MIRROR for Arrow builds.
src/include/zvec/ailego/internal/platform.h Moves ssize_t typedef before the extern "C" block so it is visible in C++ code, adds S_ISDIR macro for MSVC, and casts the prefetch pointer to const char* for MSVC _mm_prefetch compatibility.
src/db/common/file_helper.cc Replaces POSIX dirent/opendir/open-based file operations with std::filesystem equivalents under _MSC_VER; atomic copy via a .tmp file is preserved on Windows.
src/db/index/common/doc.cc Replaces GCC __builtin_bswap* with MSVC byteswap* intrinsics under #ifdef _MSC_VER for float16, uint32, and uint64 byte-swap operations; logic is correct.
src/include/zvec/ailego/logger/logger.h Introduces AILEGO_PRINTF_FORMAT macro that maps to attribute((format(printf,...))) on GCC/Clang and is empty on MSVC, cleanly removing the GCC-only attribute.
src/core/algorithm/hnsw/hnsw_entity.h Changes zero-length array extension neighbors[0] to standard C99/C++ flexible array member neighbors[] for MSVC portability.
src/core/algorithm/ivf/ivf_index_format.h Fixes zero-length array to flexible array member; adds node_id_t and key_t aliases that duplicate the key_t alias already added in sparse_utility.h within the same namespace.
src/core/utility/visit_filter.h Adds typedef unsigned int id_t under _MSC_VER to provide the POSIX id_t type that is unavailable on Windows.
tools/core/bench_result.h Replaces gettimeofday/timeval with std::chrono::steady_clock under _MSC_VER for wall-time measurement; both code paths produce correct millisecond durations.
thirdparty/lz4/CMakeLists.txt Adds a separate MSVC ExternalProject_Add path that builds lz4 via its own CMakeLists (build/cmake subdirectory) with /config support instead of the Unix Makefile path.
thirdparty/rocksdb/CMakeLists.txt Adds lz4::lz4 imported target and sets lz4 cache variables before including RocksDB so its FindLz4 logic picks up the pre-built lz4; also skips thirdparty.inc via ROCKSDB_SKIP_THIRDPARTY.
tests/db/index/storage/wal_file_test.cc Adds _ALLOW_KEYWORD_MACROS before the #define private public hack and provides a truncate() shim via _chsize; the keyword-redefine approach is non-standard but necessary for this test pattern on MSVC.
src/db/index/column/vector_column/engine_helper.hpp Replaces designated-initializer aggregate syntax with explicit member assignment to fix MSVC C++17 compilation (designated initializers require C++20 in MSVC).

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[CMake Configure] --> B{MSVC?}
    B -->|Yes| C[C++20 Standard\n/FS /EHsc /Zc:preprocessor\n/we4716 + suppress C4267 C4244 etc.]
    B -->|No| D[C++17 Standard\n-Wall -Werror=return-type]

    C --> E[Third-party builds]
    D --> E

    E --> F{lz4}
    F -->|MSVC| F1[CMake subdir build\nlz4.lib]
    F -->|Linux/Mac| F2[Makefile build\nliblz4.a]

    E --> G{Arrow}
    G -->|MSVC| G1[parquet_static.lib\narrow_static.lib etc.\n+ /FS compile flag]
    G -->|Linux/Mac| G2[libparquet.a\nlibarrow.a etc.]
    G1 --> G3[⚠️ env CONFIGURE_ENV_LIST\ndropped — USE_OSS_MIRROR broken]

    E --> H[RocksDB\n+ lz4::lz4 imported target\n+ ROCKSDB_SKIP_THIRDPARTY]
    E --> I[glog\nHAVE_SNPRINTF on MSVC]
    E --> J[sparsehash\n.windows.patch → static_assert]

    C --> K[Platform abstractions]
    K --> K1[ssize_t typedef moved\nbefore extern C]
    K --> K2[_mm_prefetch cast to const char*]
    K --> K3[S_ISDIR macro for MSVC]
    K --> K4[ailego_aligned_malloc\nreplaces posix_memalign]
    K --> K5[__builtin_bswap* →\n_byteswap_* on MSVC]
    K --> K6[Zero-length arrays 0\n→ flexible arrays]
Loading

Comments Outside Diff (1)

  1. thirdparty/arrow/CMakeLists.txt, line 63-82 (link)

    OSS mirror env vars dropped from Arrow configure command

    The env ${CONFIGURE_ENV_LIST} prefix was removed from the non-Android CONFIGURE_COMMAND. This means any mirror proxy settings set by USE_OSS_MIRROR (e.g., custom download URLs passed as env vars) will no longer be forwarded to Arrow's CMake configure step on non-Android platforms — a regression for any build that relies on USE_OSS_MIRROR.

    # Before
    CONFIGURE_COMMAND env ${CONFIGURE_ENV_LIST} "${CMAKE_COMMAND}" ...
    # After (mirror env lost)
    CONFIGURE_COMMAND "${CMAKE_COMMAND}" ${CMAKE_CACHE_ARGS} ${ARROW_EXTRA_CMAKE_ARGS} ...

    If CONFIGURE_ENV_LIST is only needed for the mirror case, you can preserve it with a conditional:

    if(USE_OSS_MIRROR AND NOT MSVC)
        set(CONFIGURE_ENV_PREFIX env ${CONFIGURE_ENV_LIST})
    endif()
    ...
    CONFIGURE_COMMAND ${CONFIGURE_ENV_PREFIX} "${CMAKE_COMMAND}" ${CMAKE_CACHE_ARGS} ${ARROW_EXTRA_CMAKE_ARGS} -DCMAKE_BUILD_TYPE=$<CONFIG> ...

Last reviewed commit: de9724b

Greptile also left 3 inline comments on this PR.

@JalinWang JalinWang marked this pull request as draft March 12, 2026 02:15
CMakeLists.txt Outdated
Comment on lines +28 to +34
add_compile_options(/wd4245 /wd4334 /wd4702 /wd4305)

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -Werror=return-type")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wall -Werror=return-type")
# //warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)
# keeps |= (1 << k);
add_compile_options(/wd4334)
######

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/wd4334 suppressed twice

Warning C4334 ("result of 32-bit shift implicitly converted to 64 bits") is already listed in the combined add_compile_options(/wd4245 /wd4334 /wd4702 /wd4305) line above, and then suppressed a second time individually. The duplicate is harmless but indicates a copy-paste oversight.

Suggested change
add_compile_options(/wd4245 /wd4334 /wd4702 /wd4305)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -Werror=return-type")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wall -Werror=return-type")
# //warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)
# keeps |= (1 << k);
add_compile_options(/wd4334)
######
add_compile_options(/wd4245 /wd4702 /wd4305)
# warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)
# keeps |= (1 << k);
add_compile_options(/wd4334)

Comment on lines +24 to +25
add_compile_options(/wd4146) # unary minus operator applied to unsigned type #usage: uint32_t seg_id_{-1U};
add_compile_options(/wd4310) # warning C4310: cast truncates constant value
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Broad narrowing-conversion warnings suppressed without fix

/wd4267 (conversion from size_t to a smaller type) and /wd4244 (general type narrowing) are silently disabled with a comment acknowledging they need to be resolved. These two warnings regularly surface real runtime bugs (e.g., size_tint truncation on 64-bit data, unintended float→int narrowing). Leaving them suppressed while the PR is merged makes it easy for future changes to introduce silent truncation bugs that are never caught.

Consider replacing the blanket suppression with targeted #pragma warning(suppress: ...) at the specific call sites that are genuinely safe, rather than disabling them project-wide.

Comment on lines 20 to 21
namespace core {

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

key_t type alias defined in two headers

using key_t = uint64_t is introduced here inside namespace zvec::core, and an identical alias is also added to src/core/utility/sparse_utility.h (line 27) in the same namespace. Any translation unit that includes both headers will have a duplicate declaration. While C++ allows re-declaring a type alias to the same type, this duplication is fragile — a future change to one without the other creates a subtle mismatch. The alias should live in a single shared header (e.g. a types.h in src/core) and be included by both files.

…m a non-owning thread, which is undefined behavior caught by MSVC's debug assertions
gtest-all.obj : warning LNK4197: export '??_7TestSuite@testing@@6b@' specified multiple times; using first specification [C:\Users\Administrator\Desktop\workspace\zvec\build_vs2022\thirdparty\googletest\googletest-1.10.0\googletest\gtest.vcxproj]
gtest-all.obj : warning LNK4197: export '??_7AssertionException@testing@@6b@' specified multiple times; using first specification [C:\Users\Administrator\Desktop\workspace\zvec\build_vs2022\thirdparty\googletest\googletest-1.10.0\googletest\gtest.vcxproj]
gtest-all.obj : warning LNK4197: export '??_7UnitTest@testing@@6b@' specified multiple times; using first specification [C:\Users\Administrator\Desktop\workspace\zvec\build_vs2022\thirdparty\googletest\googletest-1.10.0\googletest\gtest.vcxproj]
gtest-all.obj : warning LNK4197: export '??_7ScopedFakeTestPartResultReporter@testing@@6b@' specified multiple times; using first specification [C:\Users\Administrator\Desktop\workspace\zvec\build_vs2022\thirdparty\googletest\googletest-1.10.0\googletest\gtest.vcxproj]
gtest-all.obj : warning LNK4197: export '??_7UnitTestImpl@
@JalinWang JalinWang force-pushed the feat/win-compilation branch from ffc9d32 to 5c61212 Compare March 16, 2026 07:23
# Conflicts:
#	cmake/bazel.cmake
#	src/ailego/math/inner_product_matrix_fp16.cc
#	src/ailego/math/inner_product_matrix_fp32.cc
#	src/ailego/math/mips_euclidean_distance_matrix_fp32.cc
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant