PMC.MIPS24K(3)         FreeBSD Library Functions Manual         PMC.MIPS24K(3)

NAME
     pmc.mips24k - measurement events for MIPS24K family CPUs

LIBRARY
     Performance Counters Library (libpmc, -lpmc)

SYNOPSIS
     #include <pmc.h>

DESCRIPTION
     MIPS PMCs are present in MIPS 24K and other processors in the MIPS
     family.

     The hardware supports two counters, each of which is 32 bits wide.
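
     Because each counter is only 32 bits wide, a raw count can wrap around
     during a long measurement.  The following is a minimal sketch of
     wrap-safe accumulation, assuming the caller obtains raw 32-bit samples
     by some means not shown here.  The names counter_delta32, counter_acc
     and counter_acc_update are hypothetical and are not part of libpmc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Difference between two raw 32-bit counter samples (hypothetical helper).
 * Unsigned arithmetic is modulo 2^32, so a single wraparound between the
 * two samples is handled.  More than one wrap cannot be detected.
 */
static uint32_t
counter_delta32(uint32_t before, uint32_t after)
{
	return ((uint32_t)(after - before));
}

/* 64-bit running total built from successive 32-bit samples. */
struct counter_acc {
	uint64_t total;		/* accumulated event count */
	uint32_t last;		/* most recent raw sample */
};

/* Fold a new raw sample into the running total. */
static void
counter_acc_update(struct counter_acc *acc, uint32_t sample)
{
	assert(acc != NULL);
	acc->total += counter_delta32(acc->last, sample);
	acc->last = sample;
}
```

     The sketch only stays accurate if samples are taken often enough that
     the counter cannot wrap more than once between two of them.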

     MIPS PMCs are documented in MIPS32 24K Processor Core Family Software
     User's Manual, MIPS Technologies Inc., December 2008.

   Event Specifiers (Programmable PMCs)
     MIPS programmable PMCs support the following events:

     CYCLE   (Event 0, Counter 0/1) Total number of cycles.  The performance
             counters are clocked by the top-level gated clock.  If the core
             is built with that clock gater present, none of the counters will
             increment while the clock is stopped due to a WAIT instruction.

     INSTR_EXECUTED
             (Event 1, Counter 0/1) Total number of instructions completed.

     BRANCH_COMPLETED
             (Event 2, Counter 0) Total number of branch instructions
             completed.

     BRANCH_MISPRED
             (Event 2, Counter 1) Counts all branch instructions which
             completed, but were mispredicted.

     RETURN  (Event 3, Counter 0) Counts all JR $31 instructions completed.

     RETURN_MISPRED
             (Event 3, Counter 1) Counts all JR $31 instructions which
             completed, used the RPS for a prediction, but were mispredicted.

     RETURN_NOT_31
             (Event 4, Counter 0) Counts all JR $xx (not $31) and JALR
             instructions (indirect jumps).

     RETURN_NOTPRED
             (Event 4, Counter 1) Counts JR $31 instructions that were not
             predicted.  If RPS use is disabled, JR $31 will not be
             predicted.

     ITLB_ACCESS
             (Event 5, Counter 0) Counts ITLB accesses that are due to fetches
             showing up in the instruction fetch stage of the pipeline and
             which do not use a fixed mapping or are not in unmapped space.
             If an address is fetched twice from the pipe (as in the case of a
             cache miss), that instruction will count as 2 ITLB accesses.
             Since each fetch gets us 2 instructions, there is one access
             marked per double word.

     ITLB_MISS
             (Event 5, Counter 1) Counts all misses in the ITLB except ones
             that are on the back of another miss.  Back-to-back misses
             cannot be processed and are therefore ignored.  Misses are also
             ignored if there is some form of address error.

     DTLB_ACCESS
             (Event 6, Counter 0) Counts DTLB accesses, including those in
             unmapped address spaces.

     DTLB_MISS
             (Event 6, Counter 1) Counts DTLB misses.  Back-to-back misses
             that result in only one DTLB entry getting refilled are counted
             as a single miss.

     JTLB_IACCESS
             (Event 7, Counter 0) Instruction JTLB accesses are counted
             exactly the same as ITLB misses.

     JTLB_IMISS
             (Event 7, Counter 1) Counts instruction JTLB accesses that result
             in no match or a match on an invalid translation.

     JTLB_DACCESS
             (Event 8, Counter 0) Data JTLB accesses.

     JTLB_DMISS
             (Event 8, Counter 1) Counts data JTLB accesses that result in no
             match or a match on an invalid translation.

     IC_FETCH
             (Event 9, Counter 0) Counts every time the instruction cache is
             accessed.  All replays, wasted fetches etc. are counted.  For
             example, following a branch, even though the prediction is taken,
             the fall through access is counted.

     IC_MISS
             (Event 9, Counter 1) Counts all instruction cache misses that
             result in a bus request.

     DC_LOADSTORE
             (Event 10, Counter 0) Counts cached loads and stores.

     DC_WRITEBACK
             (Event 10, Counter 1) Counts cache lines written back to memory
             due to replacement or cacheops.

     DC_MISS
             (Event 11, Counter 0/1) Counts loads and stores that miss in the
             cache.

     LOAD_MISS
             (Event 13, Counter 0) Counts number of cacheable loads that miss
             in the cache.

     STORE_MISS
             (Event 13, Counter 1) Counts number of cacheable stores that miss
             in the cache.

     INTEGER_COMPLETED
             (Event 14, Counter 0) Non-floating point, non-Coprocessor 2
             instructions.

     FP_COMPLETED
             (Event 14, Counter 1) Floating point instructions completed.

     LOAD_COMPLETED
             (Event 15, Counter 0) Integer and co-processor loads completed.

     STORE_COMPLETED
             (Event 15, Counter 1) Integer and co-processor stores completed.

     BARRIER_COMPLETED
             (Event 16, Counter 0) Direct jump (and link) instructions
             completed.

     MIPS16_COMPLETED
             (Event 16, Counter 1) MIPS16e instructions completed.

     NOP_COMPLETED
             (Event 17, Counter 0) NOPs completed.  This includes all
             instructions that normally write to a general purpose register,
             but where the destination register was set to r0.

     INTEGER_MULDIV_COMPLETED
             (Event 17, Counter 1) Integer multiply and divide instructions
             completed.  (MULxx, DIVx, MADDx, MSUBx).

     RF_STALL
             (Event 18, Counter 0) Counts the total number of cycles where no
             instructions are issued from the IFU to ALU (the RF stage does
             not advance), which includes both of the previous two events.
             The RF_STALL count differs from their sum, though, because
             cycles when both stalls are active are counted only once.

     INSTR_REFETCH
             (Event 18, Counter 1) Replay traps (other than uTLB).

     STORE_COND_COMPLETED
             (Event 19, Counter 0) Conditional stores completed.  Counts all
             events, including failed stores.

     STORE_COND_FAILED
             (Event 19, Counter 1) Conditional store instruction that did not
             update memory.  Note: While this event and the SC instruction
             count event can be configured to count in specific operating
             modes, the timing of the events is much different and the
             observed operating mode could change between them, causing some
             inaccuracy in the measured ratio.

     ICACHE_REQUESTS
             (Event 20, Counter 0) Note that this only counts PREFs that are
             actually attempted.  PREFs to uncached addresses or ones with
             translation errors are not counted.

     ICACHE_HIT
             (Event 20, Counter 1) Counts PREF instructions that hit in the
             cache.

     L2_WRITEBACK
             (Event 21, Counter 0) Counts cache lines written back to memory
             due to replacement or cacheops.

     L2_ACCESS
             (Event 21, Counter 1) Number of accesses to L2 Cache.

     L2_MISS
             (Event 22, Counter 0) Number of accesses that missed in the L2
             cache.

     L2_ERR_CORRECTED
             (Event 22, Counter 1) Single bit errors in L2 Cache that were
             detected and corrected.

     EXCEPTIONS
             (Event 23, Counter 0) Any type of exception taken.

     RF_CYCLES_STALLED
             (Event 24, Counter 0) Counts cycles where the LSU is in fixup and
             cannot accept a new instruction from the ALU.  Fixups are replays
             within the LSU that occur when an instruction needs to re-access
             the cache or the DTLB.

     IFU_CYCLES_STALLED
             (Event 25, Counter 0) Counts the number of cycles where the fetch
             unit is not providing a valid instruction to the ALU.

     ALU_CYCLES_STALLED
             (Event 25, Counter 1) Counts the number of cycles where the ALU
             pipeline cannot advance.

     UNCACHED_LOAD
             (Event 33, Counter 0) Counts uncached and uncached accelerated
             loads.

     UNCACHED_STORE
             (Event 33, Counter 1) Counts uncached and uncached accelerated
             stores.

     CP2_REG_TO_REG_COMPLETED
             (Event 35, Counter 0) Co-processor 2 register to register
             instructions completed.

     MFTC_COMPLETED
             (Event 35, Counter 1) Co-processor 2 move to and from
             instructions as well as loads and stores.

     IC_BLOCKED_CYCLES
             (Event 37, Counter 0) Cycles when IFU stalls because an
             instruction miss caused the IFU not to have any runnable
             instructions.  Ignores the stalls due to ITLB misses as well as
             the 4 cycles following a redirect.

     DC_BLOCKED_CYCLES
             (Event 37, Counter 1) Counts all cycles where integer pipeline
             waits on Load return data due to a D-cache miss.  The LSU can
             signal a "long stall" on a D-cache miss, in which case the
             waiting TC might be rescheduled so other TCs can execute
             instructions until the data returns.

     L2_IMISS_STALL_CYCLES
             (Event 38, Counter 0) Cycles where the main pipeline is stalled
             waiting for a SYNC to complete.

     L2_DMISS_STALL_CYCLES
             (Event 38, Counter 1) Cycles where the main pipeline is stalled
             because of an index conflict in the Fill Store Buffer.

     DMISS_CYCLES
             (Event 39, Counter 0) Data miss is outstanding, but not
             necessarily stalling the pipeline.  The difference between this
             and D$ miss stall cycles can show the gain from non-blocking
             cache misses.

     L2_MISS_CYCLES
             (Event 39, Counter 1) L2 miss is outstanding, but not necessarily
             stalling the pipeline.

     UNCACHED_BLOCK_CYCLES
             (Event 40, Counter 0) Cycles where the processor is stalled on an
             uncached fetch, load, or store.

     MDU_STALL_CYCLES
             (Event 41, Counter 0) Counts all cycles where integer pipeline
             waits on MDU return data.

     FPU_STALL_CYCLES
             (Event 41, Counter 1) Counts all cycles where integer pipeline
             waits on FPU return data.

     CP2_STALL_CYCLES
             (Event 42, Counter 0) Counts all cycles where integer pipeline
             waits on CP2 return data.

     COREXTEND_STALL_CYCLES
             (Event 42, Counter 1) Counts all cycles where integer pipeline
             waits on CorExtend return data.

     ISPRAM_STALL_CYCLES
             (Event 43, Counter 0) Counts all pipeline bubbles that are a
             result of multicycle ISPRAM access.  Pipeline bubbles are defined
             as all cycles that IFU doesn't present an instruction to ALU.
             The four cycles after a redirect are not counted.

     DSPRAM_STALL_CYCLES
             (Event 43, Counter 1) Counts stall cycles created by an
             instruction waiting for access to DSPRAM.

     CACHE_STALL_CYCLES
             (Event 44, Counter 0) Counts all cycles where the pipeline is
             stalled due to CACHE instructions.  Includes cycles where CACHE
             instructions themselves are stalled in the ALU, and cycles where
             CACHE instructions cause subsequent instructions to be stalled.

     LOAD_TO_USE_STALLS
             (Event 45, Counter 0) Counts all cycles where integer pipeline
             waits on Load return data.

     BASE_MISPRED_STALLS
             (Event 45, Counter 1) Counts stall cycles due to skewed ALU where
             the bypass to the address generation takes an extra cycle.

     CPO_READ_STALLS
             (Event 46, Counter 0) Counts all cycles where integer pipeline
             waits on return data from MFC0, RDHWR instructions.

     BRANCH_MISPRED_CYCLES
             (Event 46, Counter 1) This counts the number of cycles from a
             mispredicted branch until the next non-delay slot instruction
             executes.

     IFETCH_BUFFER_FULL
             (Event 48, Counter 0) Counts the number of times an instruction
             cache miss was detected, but both fill buffers were already
             allocated.

     FETCH_BUFFER_ALLOCATED
             (Event 48, Counter 1) Number of cycles where at least one of the
             IFU fill buffers is allocated (miss pending).

     EJTAG_ITRIGGER
             (Event 49, Counter 0) Number of times an EJTAG Instruction
             Trigger Point condition matched.

     EJTAG_DTRIGGER
             (Event 49, Counter 1) Number of times an EJTAG Data Trigger Point
             condition matched.

     FSB_LT_QUARTER
             (Event 50, Counter 0) Fill store buffer less than one quarter
             full.

     FSB_QUARTER_TO_HALF
             (Event 50, Counter 1) Fill store buffer between one quarter and
             one half full.

     FSB_GT_HALF
             (Event 51, Counter 0) Fill store buffer more than half full.

     FSB_FULL_PIPELINE_STALLS
             (Event 51, Counter 1) Cycles where the pipeline is stalled
             because the Fill-Store Buffer in LSU is full.

     LDQ_LT_QUARTER
             (Event 52, Counter 0) Load data queue less than one quarter full.

     LDQ_QUARTER_TO_HALF
             (Event 52, Counter 1) Load data queue between one quarter and one
             half full.

     LDQ_GT_HALF
             (Event 53, Counter 0) Load data queue more than one half full.

     LDQ_FULL_PIPELINE_STALLS
             (Event 53, Counter 1) Cycles where the pipeline is stalled
             because the Load Data Queue in the LSU is full.

     WBB_LT_QUARTER
             (Event 54, Counter 0) Write back buffer less than one quarter
             full.

     WBB_QUARTER_TO_HALF
             (Event 54, Counter 1) Write back buffer between one quarter and
             one half full.

     WBB_GT_HALF
             (Event 55, Counter 0) Write back buffer more than one half full.

     WBB_FULL_PIPELINE_STALLS
             (Event 55, Counter 1) Cycles where the pipeline is stalled
             because the Write Back Buffer is full.

     REQUEST_LATENCY
             (Event 61, Counter 0) Measures latency from miss detection until
             critical dword of response is returned.  Only counts for
             cacheable reads.

     REQUEST_COUNT
             (Event 61, Counter 1) Counts number of cacheable read requests
             used for previous latency counter.
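
     Each event above is tied to Counter 0, Counter 1, or either of them.
     Two events can therefore be measured in the same run only if they can
     be placed on different counters.  The following is a minimal sketch of
     that check over a small excerpt of the event list, assuming each
     hardware counter counts one event at a time.  The table layout and the
     names find_event and can_count_together are hypothetical and are not
     part of libpmc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CTR0	0x1u	/* event is available on Counter 0 */
#define CTR1	0x2u	/* event is available on Counter 1 */

struct mips24k_event {
	const char *name;	/* event specifier */
	unsigned    event;	/* hardware event number */
	unsigned    counters;	/* bitmask of CTR0 and CTR1 */
};

/* Excerpt of the event list; values are copied from the entries above. */
static const struct mips24k_event events[] = {
	{ "CYCLE",             0, CTR0 | CTR1 },
	{ "INSTR_EXECUTED",    1, CTR0 | CTR1 },
	{ "BRANCH_COMPLETED",  2, CTR0 },
	{ "BRANCH_MISPRED",    2, CTR1 },
	{ "RETURN",            3, CTR0 },
	{ "DC_MISS",          11, CTR0 | CTR1 },
	{ "LOAD_MISS",        13, CTR0 },
	{ "STORE_MISS",       13, CTR1 },
};

/* Look up an event by name; returns NULL if it is not in the excerpt. */
static const struct mips24k_event *
find_event(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(events) / sizeof(events[0]); i++)
		if (strcmp(events[i].name, name) == 0)
			return (&events[i]);
	return (NULL);
}

/* True if the two events can be assigned to different counters. */
static bool
can_count_together(const char *a, const char *b)
{
	const struct mips24k_event *ea, *eb;

	assert(a != NULL && b != NULL);
	ea = find_event(a);
	eb = find_event(b);
	if (ea == NULL || eb == NULL)
		return (false);
	return (((ea->counters & CTR0) && (eb->counters & CTR1)) ||
	    ((ea->counters & CTR1) && (eb->counters & CTR0)));
}
```

     Under these assumptions BRANCH_COMPLETED and BRANCH_MISPRED can be
     counted together, while BRANCH_COMPLETED and RETURN cannot, because
     both are available only on Counter 0.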

   Event Name Aliases
     The following table shows the mapping between the PMC-independent aliases
     supported by Performance Counters Library (libpmc, -lpmc) and the
     underlying hardware events used.

     Alias                 Event
     instructions          INSTR_EXECUTED
     branches              BRANCH_COMPLETED
     branch-mispredicts    BRANCH_MISPRED
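
     The alias table can be mirrored in a small lookup, and the two branch
     events combine naturally into a misprediction ratio.  The following is
     a minimal sketch, assuming the counts have already been collected.
     libpmc resolves aliases internally; the names resolve_alias and
     mispredict_ratio are hypothetical and are not part of libpmc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct event_alias {
	const char *alias;	/* PMC-independent alias */
	const char *event;	/* underlying MIPS24K event */
};

/* Copied from the alias table above. */
static const struct event_alias aliases[] = {
	{ "instructions",       "INSTR_EXECUTED" },
	{ "branches",           "BRANCH_COMPLETED" },
	{ "branch-mispredicts", "BRANCH_MISPRED" },
};

/* Return the event behind an alias, or the name itself if not an alias. */
static const char *
resolve_alias(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(aliases) / sizeof(aliases[0]); i++)
		if (strcmp(aliases[i].alias, name) == 0)
			return (aliases[i].event);
	return (name);
}

/*
 * Fraction of completed branches that were mispredicted, computed from
 * BRANCH_COMPLETED and BRANCH_MISPRED counts.  Returns 0.0 when no
 * branches were counted, to avoid dividing by zero.
 */
static double
mispredict_ratio(uint64_t branches, uint64_t mispredicts)
{
	if (branches == 0)
		return (0.0);
	return ((double)mispredicts / (double)branches);
}
```

     For example, 50 mispredictions out of 200 completed branches give a
     ratio of 0.25.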

SEE ALSO
     pmc(3), pmc.atom(3), pmc.core(3), pmc.iaf(3), pmc.k7(3), pmc.k8(3),
     pmc.octeon(3), pmc.soft(3), pmc.tsc(3), pmc_cpuinfo(3), pmclog(3),
     hwpmc(4)

HISTORY
     The pmc library first appeared in FreeBSD 6.0.

AUTHORS
     The Performance Counters Library (libpmc, -lpmc) library was written by
     Joseph Koshy <jkoshy@FreeBSD.org>.  MIPS support was added by George
     Neville-Neil <gnn@FreeBSD.org>.

FreeBSD 13.1-RELEASE-p6         March 24, 2012         FreeBSD 13.1-RELEASE-p6
