Predication facilitates high-bandwidth fetch and large static scheduling regions, but has typically been too complex to implement comprehensively in out-of-order microarchitectures. This paper describes dataflow predication, which provides per-instruction predication in a dataflow ISA, low predication computation overheads similar to those of VLIW ISAs, and low-complexity out-of-order issue. A two-bit field in each instruction specifies whether the instruction is predicated; if it is, an arriving predicate token determines whether the instruction executes. Dataflow predication incorporates three features that reduce predication overheads. First, dataflow predicate computation permits computation of compound predicates with virtually no overhead instructions. Second, early mispredication termination squashes in-flight instructions with false predicates at any time, eliminating the overhead of falsely predicated paths. Finally, implicit predication mitigates the fan-out overhead of dataflow predicates by explicitly predicating only the heads of dependence chains, which reduces the number of explicitly predicated instructions. Dataflow predication also exposes new compiler optimizations, such as disjoint instruction merging and path-sensitive predicate removal, that increase the performance of predicated code in an out-of-order design.
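
As an illustration of the per-instruction predicate field described above, the following is a minimal Python sketch, not the paper's implementation. The names `PredMode` and `should_execute` and the specific bit encodings are assumptions introduced here for illustration. The sketch also ignores data operands and models only the predicate check.

```python
from enum import Enum
from typing import Optional


class PredMode(Enum):
    # Hypothetical encoding of the two-bit predicate field.
    # The actual bit assignments are an assumption, not taken from the text.
    UNPREDICATED = 0b00
    ON_FALSE = 0b10
    ON_TRUE = 0b11


def should_execute(mode: PredMode, predicate_token: Optional[bool]) -> Optional[bool]:
    """Decide whether an instruction may fire, looking only at its predicate.

    Returns True if the instruction may execute, False if the arriving
    predicate does not match (the instruction never fires), and None if a
    predicated instruction is still waiting for its predicate token.
    """
    if mode is PredMode.UNPREDICATED:
        return True  # no predicate token is needed
    if predicate_token is None:
        return None  # predicated, but the token has not arrived yet
    wanted = mode is PredMode.ON_TRUE
    return predicate_token == wanted
```

Under these assumptions, an instruction predicated on true fires only when a true token arrives, while an unpredicated instruction never waits for a token.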