Skip to content

Conversation

@angerman
Copy link

This is a continuation of @luite's shrinkstage1 MR.

Comment on lines +98 to +99
, stage0 `cabalFlag` "minimal"
, stage0 `cabalFlag` "no-uncommon-ncgs"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we actually should care about hadrian, or not.

@angerman angerman force-pushed the feat/minimal-stage1 branch 2 times, most recently from d627df0 to b5c275f Compare November 27, 2025 01:07
@angerman angerman force-pushed the stable-ghc-9.14.2025.11.12 branch from 257a3dc to e808dde Compare November 28, 2025 05:57
@angerman angerman force-pushed the stable-ghc-9.14.2025.11.12 branch from e808dde to 537b3a7 Compare November 28, 2025 08:40
@angerman angerman deleted the branch stable-ghc-9.14 November 29, 2025 02:16
@angerman angerman closed this Nov 29, 2025
@angerman angerman reopened this Nov 29, 2025
@angerman angerman changed the base branch from stable-ghc-9.14.2025.11.12 to stable-ghc-9.14 November 29, 2025 02:28
@angerman angerman force-pushed the feat/minimal-stage1 branch from 6d59cbf to b3cd485 Compare November 29, 2025 02:43
Copy link

@luite luite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice cleanup of the register code and organizing stubs.

In the current revision, there are some code that checks HAVE_INTERPRETER as a condition to import some GHC.StgToJS modules, even though those are guarded by a different flag in ghc.cabal.in. It's way too easy to make this kind of mistakes and correcting them by hand is tedious.

For maintainability I think it's essential to automatically prevent this.

Some ideas:

  • a flag to have GHC immediately error when an "unlisted" (i.e. not explicitly named on the command line, even if its source is available in the hs-source-dirs) module from the current package is imported
  • move every set of flag-guarded modules to a specific hs-source-dir that's only added when that flag is active

Perhaps we'd also need to have a testsuite test that checks the dependencies between the modules for when all optional components have been turned off (unless any mistake would already guarantee failure of stage1 to build)

compiler/GHC.hs Outdated
}
}
#else
-- MINIMAL build: no plugin support, no IC modifications
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a bunch of references to MINIMAL still in the source, in comments and panic messages. Maybe some LLM can sweep them up quickly.

-- | Calculate the maximum number of register colors that could be
-- denied to a node of this class due to having this virtual reg as a neighbour.
targetVirtualRegSqueeze :: Platform -> RegClass -> VirtualReg -> Int
targetVirtualRegSqueeze platform = rtVirtualRegSqueeze (selectRegTarget platform)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw some {-# INLINE virtualRegSqueeze #-} in the source that may or may not suggest that this code is performance sensitive. Given the way it's being used I don't expect the pragma to actually be beneficial. Is there any measurable performance difference before and after this change?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good observation. Yes we'll only know once we have good performance testing ... on complex applications :(


-- Code Generation Backends
if flag(js-backend)
cpp-options: -DHAVE_JS_BACKEND
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this flag is never checked, looks like some checks are still on HAVE_INTERPRETER

@angerman angerman force-pushed the feat/minimal-stage1 branch 4 times, most recently from 05a4bf8 to f9d6769 Compare December 2, 2025 03:36
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really, really hat this module. Preferrably we wouldn't need stubs at all.

This to me signals for each of the stubs we have here, we unintentionally use some of this in places we shouldn't even use this to begin with.

Comment on lines +69 to +73
+x86-ncg +aarch64-ncg
-ppc-ncg -riscv64-ncg -loongarch64-ncg
-js-backend -wasm-backend
+llvm-backend
-interpreter -dynamic-linker
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might want to disable llvm-backend by default. And only enable +x86-ncg with a conditionalif arch(x86), same for aarch64?

angerman and others added 6 commits December 11, 2025 16:27
Add a new -frontend-plugins flag to control whether frontend plugin
loading support is enabled. This allows stage1 builds to disable
plugin loading since GHC.Runtime.Loader depends on the ghci package.

- Add frontend-plugins flag to ghc-bin.cabal.in (default: True)
- Guard loadFrontendPlugin and initializeSessionPlugins with CPP
- Use defaultFrontendPlugin fallback when plugins disabled
Add +minimal and +no-uncommon-ncgs flags to reduce stage1 compiler size.

The +minimal flag removes:
- Bytecode interpreter (GHC.ByteCode.*)
- JavaScript backend (GHC.StgToJS.*)
- WebAssembly backend (GHC.Wasm.*)
- GHCi and interactive features
- Template Haskell execution (can parse/typecheck, can't run)

The +no-uncommon-ncgs flag removes native code generators for:
- PowerPC (PPC)
- RISC-V 64-bit (RV64)
- LoongArch64

This commit also adds comprehensive CPP guards throughout the compiler
to support MINIMAL builds that can compile code and link executables
via the system linker without GHCi, Template Haskell execution,
bytecode interpreter, or runtime loading functionality.

Key guarded modules:
- GHC.hs: Guard interactive/GHCi exports and imports
- GHC/Driver/Main.hs: Guard plugin initialization and interpreter code
- GHC/Driver/Make.hs: Guard plugin init, add stub for addSptEntries
- GHC/Driver/Pipeline.hs: Guard JS linker
- GHC/Driver/Pipeline/Execute.hs: Guard JS-specific phases
- GHC/Driver/Session/Inspect.hs: Guard backend checks for interpreter
- GHC/Tc/Gen/Splice.hs: Add stubs for TH execution functions
- GHC/Unit/Module/Graph.hs: Make showModMsg LinkNode unconditional
- CmmToAsm/*: Guard PPC/RISCV64/LoongArch64 code generation
- GHC/Platform/Regs.hs: Guard PPC/RISCV64/LoongArch64 platform regs

Stage1 builds 801 modules (vs 885+ for full build), stage2 builds
complete successfully with full-featured compiler.

Co-authored-by: Moritz Angermann <[email protected]>
Replace the broad +minimal and +no-uncommon-ncgs flags with granular,
topic-specific flags that use positive (opt-in) HAVE_* CPP defines.

New flag system in ghc.cabal.in:

**Native Code Generator flags:**
- x86-ncg: HAVE_X86_NCG (default: True)
- aarch64-ncg: HAVE_AARCH64_NCG (default: True)
- ppc-ncg: HAVE_PPC_NCG (default: True)
- riscv64-ncg: HAVE_RISCV64_NCG (default: True)
- loongarch64-ncg: HAVE_LOONGARCH64_NCG (default: True)

**Backend flags:**
- js-backend: HAVE_JS_BACKEND (default: True)
- wasm-backend: HAVE_WASM_BACKEND (default: True)
- llvm-backend: HAVE_LLVM_BACKEND (default: True)

**Runtime feature flags:**
- interpreter: HAVE_INTERPRETER (default: True)
- dynamic-linker: HAVE_DYNAMIC_LINKER (default: True)

CPP guard updates:
- #if !defined(MINIMAL) -> #if defined(HAVE_INTERPRETER)
- #if defined(MINIMAL) -> #if !defined(HAVE_INTERPRETER)
- #if !defined(NO_UNCOMMON_NCGS) -> individual #if defined(HAVE_*_NCG)

Updated cabal.project.stage1 to use new flags:
  +x86-ncg +aarch64-ncg -ppc-ncg -riscv64-ncg -loongarch64-ncg
  -js-backend -wasm-backend +llvm-backend
  -interpreter -dynamic-linker

Stage1 now builds 801 modules (vs 885 for full build), excluding:
- ByteCode interpreter modules
- JavaScript backend (StgToJS)
- WebAssembly backend
- Uncommon NCGs (PPC, RISC-V, LoongArch64)
- GHCi/interpreter runtime support
…dant guards

This commit reduces CPP spread in the GHC NCG codebase through two approaches:

**Phase 1: Remove redundant CPP from conditional modules**

Modules that are only compiled when a cabal flag is enabled (e.g.,
`flag(interpreter)`) don't need internal CPP guards checking that same flag.

- GHC/ByteCode/Linker.hs: Remove #else stub block (module is already
  conditional via ghc.cabal.in)
- GHC/Runtime/Eval.hs: Remove 3 CPP import guards
- GHC/Runtime/Loader.hs: Remove CPP import guard

**Phase 2: Record-based dispatch pattern for NCG**

Replaced repetitive CPP in Target.hs and FreeRegs.hs with a cleaner
record-based dispatch pattern:

- GHC/CmmToAsm/Reg/Target.hs: Major refactor
  - Created `RegTarget` record bundling all 5 register operations
  - Each architecture defines its RegTarget in ONE CPP block (was 5)
  - Single `selectRegTarget :: Platform -> RegTarget` dispatch function
  - Exported functions are simple wrappers with no CPP
  - Reduces CPP blocks from ~25 to ~5 (one per arch)

- GHC/CmmToAsm/Reg/Linear/FreeRegs.hs: Applied same pattern to maxSpillSlots
  - Single `selectMaxSpillSlots` dispatch function
  - Cleaner unavailable arch handling

This pattern consolidates CPP to:
1. Conditional imports (required - modules are conditionally compiled)
2. Conditional arch-specific record definitions (one block per arch)
3. Single dispatch function with conditional entries

Net result: 59 fewer lines, cleaner code, easier to add/remove architectures.
Create GHC.Runtime.Interpreter.Stubs module containing all stub type
definitions for builds without interpreter support. This eliminates
duplication of stub types across 6+ files.

Centralized types include:
- Core GHCi types: HValue, ForeignRef, ForeignHValue, RemoteRef, RemotePtr, HValueRef
- Communication types: Pipe, LoadedDLL
- Break/debug types: BreakArray, InternalBreakpointId
- Template Haskell types: THMessage, THResultType
- Eval types: EvalExpr, ResumeContext, EvalStep
- StgToJS types: LinkPlan, StgToJSConfig
- Linker environment types: ItblEnv, AddrEnv

Updated files to import from centralized module:
- GHC/Linker/Types.hs
- GHC/Runtime/Eval/Types.hs
- GHC/Runtime/Interpreter/Types.hs
- GHC/Driver/Hooks.hs
- GHC/Hs/Expr.hs
- GHC/Tc/Gen/Splice.hs

The Stubs module is conditionally compiled via ghc.cabal.in only when
the interpreter flag is disabled (!flag(interpreter)).

This reduces code duplication and provides a single source of truth
for all stub type definitions in minimal/non-interpreter builds.
The file uses #if defined(HAVE_INTERNAL_INTERPRETER) guards but was
missing the {-# LANGUAGE CPP #-} pragma, causing a parse error.

While the module is only compiled when flag(interpreter) is enabled,
it still needs CPP for the more specific HAVE_INTERNAL_INTERPRETER
guards within the module.
Safe Haskell import checking (hscCheckSafeImports) is a compile-time
feature that should work regardless of interpreter support. The previous
CPP guard `#if !defined(HAVE_INTERPRETER)` caused stage1 builds (which
use -interpreter flag) to skip Safe Haskell checking when compiling
boot libraries.

This resulted in incorrect Safe Haskell metadata in .hi files, causing
testsuite failures for Safe Haskell tests (BadImport, Dep, ImpSafe,
safePkg01, etc.) because ghc-internal modules compiled by stage1 had
different safety inferences than expected.

The fix removes the CPP guard so hscCheckSafeImports always runs,
ensuring consistent Safe Haskell behavior regardless of whether the
compiler has interpreter support enabled.
This commit addresses all review comments from PR 118:

1. Fix HAVE_INTERPRETER vs HAVE_JS_BACKEND mismatch (Critical)
   - Split CPP guards: HAVE_INTERPRETER for bytecode/interpreter
   - Use HAVE_JS_BACKEND for StgToJS modules
   - Affected files: GHC/Driver/Main.hs, Pipeline.hs, Pipeline/Execute.hs

2. Clean up "MINIMAL build" references
   - Replace with explicit CPP flag references
   - e.g., "HAVE_INTERPRETER not defined" instead of "MINIMAL build"
   - Affected files: GHC.hs, Driver/Main.hs, Driver/Make.hs,
     Driver/Session/Inspect.hs, Driver/Pipeline/Execute.hs,
     Tc/Types.hs, Tc/Gen/Splice.hs

3. Refactor RegTarget to GADT-based approach
   - Introduce ArchKind data type for type-level architecture tags
   - Use GADT RegTarget with architecture-specific constructors
   - Add SomeRegTarget existential wrapper for runtime dispatch
   - Define RegOps typeclass with per-architecture instances
   - Add withRegTarget helper using rank-2 types
   - Add INLINE pragmas for performance
The GADT refactoring of Reg/Target exports RegOps(..) which includes
mkVirtualReg as a typeclass method. PPC/CodeGen.hs imports both PPC.Regs
(which has its own mkVirtualReg) and Reg.Target, creating an ambiguous
reference.

Fix by using an explicit import list for Reg.Target, importing only
targetClassOfReg which is the only function actually used.
- Fix JS linker CPP guards: use HAVE_JS_BACKEND instead of HAVE_INTERPRETER
  for linkJSBinary (lines 456, 474, 603 in Pipeline.hs). Linking JS
  binaries doesn't require the interpreter.

- Revert formatting change in Reg/Target.hs:298 to avoid unnecessary
  whitespace changes in an already hard-to-rebase patch.
ghc-boot depends on ghc-platform, so it must be included in the
stage1 build. This was incorrectly removed during rebase conflict
resolution.
Split the monolithic GHC.Hs.Instances module (344 Data instances, 14MB
object file) into 5 parallel-compilable sub-modules:

  - GHC.Hs.Instances.Common      - phase-independent instances
  - GHC.Hs.Instances.Transitions - LR types spanning multiple phases
  - GHC.Hs.Instances.Parsed      - GhcPs instances only
  - GHC.Hs.Instances.Renamed     - GhcRn instances only
  - GHC.Hs.Instances.Typechecked - GhcTc instances only

The wrapper module GHC.Hs.Instances re-exports all sub-modules for
backward compatibility.

Additionally, all Instance modules are now excluded from stage1 builds
(via -interpreter flag) since Data instances are only needed at runtime
for Template Haskell and GHCi, not for compilation.

Benefits:
- Stage1: Saves ~14MB object file + compile time
- Stage2: 5 modules can compile in parallel instead of 1 serial

See #9557 for why these use -O0, #18254 for the parallelism issue.
These modules were added in PR #123 (Build iserv on demand) but were
missing from the interpreter section of ghc.cabal.in after the rebase.
This module was incorrectly added during the rebase - the source file
doesn't exist in either the base branch or our PR.
The deriving instance for Data (HsModule GhcPs) requires Data instances
for HsDecl and other AST types, which are now in the GHC.Hs.Instances.*
modules. Since those modules are only included when the interpreter flag
is enabled, we must also guard this deriving instance.

This fixes the stage1 build failure:
  No instance for 'Data (HsDecl GhcPs)' arising from a use of 'k'
  In the instance declaration for 'Data (HsModule GhcPs)'
The Instance modules provide Data instances that are used throughout
the compiler (e.g., GHC.Iface.Ext.Ast for HIE file generation). They
cannot be conditional on the interpreter flag.

The split into multiple modules was for parallel compilation (#18254),
not for stage1 exclusion. Keep them unconditional so all dependent
modules can compile.

Reverts the conditional placement from the previous commit.
The split Instance modules were missing dependencies on each other:

- Common.hs: Added `import GHC.Hs.Type` for HsTypeGhcPsExt type
- Parsed.hs, Renamed.hs, Typechecked.hs: Added imports from Common
  (for HsFieldBind, HsTypeGhcPsExt instances) and Transitions (for
  StmtLR instances)

This establishes the correct dependency chain:
  Common -> Transitions -> Parsed/Renamed/Typechecked -> Instances

Fixes build errors:
- "No instance for 'Data (StmtLR GhcPs GhcPs ...)'
- "No instance for 'Data (HsFieldBind ...)'
- "Not in scope: type constructor or class 'HsTypeGhcPsExt'"
…R dependencies

The NHsValBindsLR type contains a hard-coded [LSig GhcRn] field, creating
an unavoidable dependency: HsExpr GhcPs -> HsLocalBinds -> NHsValBindsLR
-> [LSig GhcRn]. This means many GhcPs Data instances must be in Renamed.hs.

Changes:
- Move all HsLocalBinds-dependent GhcPs instances from Parsed.hs to Renamed.hs
- Move LR instances (HsLocalBindsLR, HsValBindsLR, etc.) to phase-specific modules
- Empty Transitions.hs as all instances now belong in their target phase module
- Reorganize Common.hs to only contain truly phase-independent instances

The split now follows a clear dependency hierarchy:
  Common.hs (no phase deps)
    -> Parsed.hs (GhcPs types without HsLocalBinds dependencies)
    -> Renamed.hs (GhcRn + all GhcPs types depending on HsLocalBinds/GhcRn)
    -> Typechecked.hs (GhcTc types)

This resolves #18254 build failures while maintaining parallel compilation benefits.
- Add stub object-code linker functions to GHC.Runtime.Interpreter
  that panic when called (initObjLinker, lookupSymbol, loadDLL, etc.)
- Fix InterpSymbol import to use GHC.Runtime.Interpreter.Types.SymbolCache
- Fix InterpSymbol type parameter in lookupSymbol/lookupClosure signatures
- Change resolveObjs return type from IO Bool to IO SuccessFlag
- Fix loadDLL return type to IO (Either String (RemotePtr LoadedDLL))
- Fix FixitySig import in GHC.Hs.Instances.Parsed to use source module
  (Language.Haskell.Syntax.Binds) for constructor visibility

These changes allow stage1 builds without interpreter support to compile.
Make the entire module conditional on HAVE_INTERPRETER:
- When HAVE_INTERPRETER is defined: use full implementation with all
  interpreter types (WasmInterpConfig, JSInterpConfig, etc.)
- When HAVE_INTERPRETER is not defined: provide stub InterpOpts type
  and initInterpreter that always returns Nothing

This resolves build errors:
- Ambiguous StgToJSConfig (stub vs real type conflict)
- Missing WasmInterpConfig and JSInterpConfig types
- Missing interpreter configuration field names
When building without interpreter support (-interpreter flag),
InterpOpts is a stub empty data type without any fields. The
initInterpOpts function was trying to use all 20+ InterpOpts fields
unconditionally, causing build failures.

This fix makes the entire module conditional on HAVE_INTERPRETER:
- When defined: full implementation creating InterpOpts with all fields
- When not defined: stub that returns empty InterpOpts

This allows stage1 minimal builds (without interpreter) to complete
successfully.
@angerman angerman force-pushed the feat/minimal-stage1 branch from 9a43f87 to f139792 Compare December 11, 2025 07:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants