CN111581133A - Method, system, equipment and readable medium for multi-core memory consistency - Google Patents

Publication number
CN111581133A
Authority
CN
China
Prior art keywords
command
data
processor
response
chip
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010372938.1A
Other languages
Chinese (zh)
Inventor
刘同强
王朝辉
李拓
周玉龙
邹晓峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202010372938.1A
Publication of CN111581133A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00: Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14: Handling requests for interconnection or transfer
    • G06F13/16: Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1668: Details of memory controller
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00: Digital computers in general; Data processing equipment in general
    • G06F15/16: Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163: Interprocessor communication

Abstract

The invention discloses a method for multi-core memory consistency, comprising the following steps: receiving a command sent by a main processor and determining whether the command hits in a main cache; in response to the command missing in the main cache, determining the type of the command; in response to the command being a read command, sending the read command to the same-cluster processor and determining whether the same-cluster processor returns a data response; in response to the same-cluster processor not returning a data response, sending the read command to the off-chip storage controller and checking the status bit of the off-chip storage controller; and in response to the state of the off-chip storage controller being unused, reading data from the off-chip memory to update the main cache data and returning the data to the main processor. The invention also discloses a system, a computer device and a readable storage medium for multi-core memory consistency. By designing a multi-core memory consistency protocol suited to the embedded field, the invention improves the performance and efficiency of data processing and reduces cost.

Description

Method, system, equipment and readable medium for multi-core memory consistency
Technical Field
The present invention relates to the field of multi-core processing technologies, and in particular, to a method, a system, a device, and a readable medium for multi-core memory consistency.
Background
With the development of science and technology, computers are now widely used in many fields, and rapid advances in processor technology have made computing ubiquitous. The processor is one of the main devices in a computer and is its core component. Its main functions are to interpret computer instructions and to process the data in computer software. As demands on computing speed and scale keep growing, the performance of a single-processor computer system cannot be increased without limit because of constraints on chip speed and process technology, which is why multiprocessor technology emerged.
In a multiprocessor, caches hold both shared and private data. Private data is used by a single processor, while shared data is used by multiple processors. In essence, processors communicate by reading and writing shared data, which makes cache coherency a necessary technique for multiprocessors.
The Illinois protocol, MESI, is a widely used cache coherency protocol that supports the write-back policy. MESI is an acronym for the four cache line states: Modified, Exclusive, Shared, and Invalid. Every cache line in a multiprocessor system is in one of these four states. The MESI protocol is a state machine that can handle requests from the local processor and can also broadcast information onto the bus.
In the prior art, the MESI protocol requires each cache line to carry two status bits that record which of the modified (M), exclusive (E), shared (S) or invalid (I) states the line is currently in, and thereby determine its read/write behavior. These two status bits consume additional storage in the cache and the main memory. Moreover, MESI is designed for large multi-core and multi-processor systems; for processors with few cores its protocol handling is unnecessarily complex and occupies more logic resources, so it is not cost-effective in the embedded field.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method, a system, a device, and a readable medium for multi-core memory consistency, which improve the performance and efficiency of data processing and save the cost by designing a multi-core memory consistency protocol suitable for the embedded field.
In view of the above, an aspect of the embodiments of the present invention provides a method for multi-core memory consistency, including the following steps: receiving a command sent by a main processor and determining whether the command hits in a main cache; in response to the command missing in the main cache, determining the type of the command; in response to the command being a read command, sending the read command to the same-cluster processor and determining whether the same-cluster processor returns a data response; in response to the same-cluster processor not returning a data response, sending the read command to the off-chip storage controller and checking the status bit of the off-chip storage controller; and in response to the state of the off-chip storage controller being unused, reading data from the off-chip memory to update the main cache data and returning the data to the main processor.
In some embodiments, the method further comprises: in response to the command hitting in the main cache, determining the type of the command; in response to the command being a read command, returning a data response to the main processor; and in response to the command being a write command, updating the main cache data.
In some embodiments, the method further comprises: in response to the command being a write command, sending the write command to the off-chip storage controller; receiving the data response returned by the off-chip storage controller to the main cache and replacing the main cache data; and writing the pre-replacement data to the off-chip storage controller, which updates it in the off-chip memory.
In some embodiments, sending the read command to the same-cluster processor and determining whether the same-cluster processor returns a data response comprises: in response to the same-cluster processor returning a data response, updating the main cache data and returning the data response to the main processor.
In some embodiments, the method further comprises: in response to the state of the off-chip storage controller being used, writing the data of the off-chip storage controller into the off-chip memory and checking the state of the off-chip storage controller again.
In some embodiments, sending the read command to the same-cluster processors comprises: sending the read command to the same-cluster processors in sequence; the method further comprises: in response to any one of the same-cluster processors returning a data response, stopping sending the read command to the other same-cluster processors, updating the main cache data, and returning the data response to the main processor.
In some embodiments, reading data from the off-chip memory to update the main cache data and returning the data to the main processor further comprises: updating the state of the off-chip storage controller to used.
In another aspect of the embodiments of the present invention, a system for multi-core memory consistency is further provided, including: a main cache determination module configured to receive a command sent by the main processor and determine whether the command hits in the main cache; a command determination module configured to determine the type of the command in response to the command missing in the main cache; a same-cluster processor determination module configured to, in response to the command being a read command, send the read command to the same-cluster processor and determine whether the same-cluster processor returns a data response; an off-chip storage controller determination module configured to, in response to the same-cluster processor not returning a data response, send the read command to the off-chip storage controller and check the status bit of the off-chip storage controller; and a processing module configured to, in response to the state of the off-chip storage controller being unused, read data from the off-chip memory to update the main cache data and return the data to the main processor.
In another aspect of the embodiments of the present invention, there is also provided a computer device, including: at least one processor; and a memory storing computer instructions executable on the processor, the instructions being executable by the processor to implement the method steps as above.
In a further aspect of the embodiments of the present invention, a computer-readable storage medium is also provided, which stores a computer program that implements the above method steps when executed by a processor.
The invention has the following beneficial technical effects: by designing a multi-core storage consistency protocol suitable for the embedded field, the performance and efficiency of data processing are improved, and the cost is saved.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other embodiments from these drawings without creative effort.
FIG. 1 is a diagram illustrating an embodiment of a method for multi-core memory coherency according to the present invention;
FIG. 2 is a block diagram of a method for multi-core memory coherency according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.
It should be noted that all uses of "first" and "second" in the embodiments of the present invention serve to distinguish two entities that share a name but are not the same, or two parameters that are not the same. "First" and "second" are used merely for convenience of description and should not be construed as limiting the embodiments of the present invention; this is not repeated in the following embodiments.
In view of the above objects, a first aspect of the embodiments of the present invention provides an embodiment of a method for multi-core memory coherency. FIG. 1 is a diagram illustrating an embodiment of a method for multi-core memory coherency according to the present invention. As shown in fig. 1, the embodiment of the present invention includes the following steps:
S1, receiving a command sent by the main processor and determining whether the command hits in the main cache;
S2, in response to the command missing in the main cache, determining the type of the command;
S3, in response to the command being a read command, sending the read command to the same-cluster processors and determining whether the same-cluster processors return a data response;
S4, in response to the same-cluster processors not returning a data response, sending the read command to the off-chip storage controller and checking the status bit of the off-chip storage controller; and
S5, in response to the state of the off-chip storage controller being unused, reading data from the off-chip memory to update the main cache data and returning the data to the main processor.
In this embodiment, FIG. 2 is a block diagram illustrating an embodiment of a method for multi-core memory consistency according to the present invention. As shown in FIG. 2, the architecture of the invention includes a multi-core RISC-V processor cluster and off-chip storage. The multi-core chip uses RISC-V as its instruction set and internally comprises a plurality of processor clusters of 2 cores each, caches, and an off-chip storage DRAM controller. The state transition rules of the cache are as follows: a local read/write moves the invalid state to the valid state, and a remote read/write leaves the invalid state unchanged; a remote read/write moves the valid state to the invalid state, and a local read/write leaves the valid state unchanged. The state transition rules of the off-chip storage DRAM controller are as follows: a read operation moves the unused state to the used state, and a write operation moves the used state to the unused state.
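The two transition rules above can be written as a pair of small state functions. The following is a minimal Python sketch rather than the patented implementation: the names `LineState`, `ControllerState`, `next_line_state` and `next_controller_state` and the event strings are hypothetical, and leaving the controller state unchanged for combinations the text does not mention is an assumption.

```python
from enum import Enum


class LineState(Enum):
    INVALID = 0
    VALID = 1


class ControllerState(Enum):
    UNUSED = 0
    USED = 1


LOCAL_EVENTS = {"local_read", "local_write"}
REMOTE_EVENTS = {"remote_read", "remote_write"}


def next_line_state(state: LineState, event: str) -> LineState:
    """Cache rule: local accesses end in VALID, remote accesses end in INVALID."""
    if event in LOCAL_EVENTS:
        return LineState.VALID      # invalid -> valid, valid stays valid
    if event in REMOTE_EVENTS:
        return LineState.INVALID    # valid -> invalid, invalid stays invalid
    raise ValueError(f"unknown event: {event}")


def next_controller_state(state: ControllerState, op: str) -> ControllerState:
    """Controller rule: a read moves UNUSED to USED, a write moves USED to UNUSED."""
    if state is ControllerState.UNUSED and op == "read":
        return ControllerState.USED
    if state is ControllerState.USED and op == "write":
        return ControllerState.UNUSED
    # Assumption: combinations the text does not mention leave the state unchanged.
    return state
```

With two states on each side, one bit per cache line and one bit per tracked address in the controller would be enough to encode them, under the assumptions above.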
In this embodiment, processor 0 sends a read command to its own cache0, and the cache0 controller queries the local cache. On a miss, it sends a request command to processor 1 and determines whether a data response is returned. If no data response is returned, cache0 sends a read request to the off-chip storage DRAM controller, which checks whether the status bit is 0, that is, whether the state is unused. If the status bit is 0, the controller reads the off-chip DRAM data, returns a data response to the cache0 controller, and updates its corresponding status bit. Cache0 stores the data in the cache and returns it to processor 0, which ends the processing. The method ensures that at any given time only one processor holds valid data for a given address, which greatly reduces conflict scenarios, that is, it reduces the handling of multiple requests to the same address. Read and write requests to the same address are executed serially without interruption.
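The read path through the unused state can be traced with a small model. This is only an illustrative Python sketch: caches and DRAM are modelled as dictionaries keyed by address, an absent key stands for the invalid state, the status bit is assumed to be kept per address, and the function name `read_via_dram` is hypothetical. The peer-response and used-state branches belong to other embodiments and are deliberately left out.

```python
def read_via_dram(addr, cache0, cache1, status_bits, dram):
    """Serve a read from processor 0 along the unused-state path."""
    if addr in cache0:
        # Local hit: cache0 answers directly.
        return cache0[addr]
    if addr in cache1:
        # The same-cluster processor would answer; not modelled in this sketch.
        raise NotImplementedError("peer response path is not modelled here")
    if status_bits.get(addr, 0) != 0:
        # Status bit 1 (used); not modelled in this sketch.
        raise NotImplementedError("used-state path is not modelled here")
    data = dram[addr]          # status bit 0: read the off-chip DRAM
    status_bits[addr] = 1      # controller updates the corresponding status bit
    cache0[addr] = data        # cache0 stores the data before replying
    return data


# Usage: a miss everywhere on-chip falls through to DRAM.
cache0, cache1, status_bits = {}, {}, {}
dram = {0x40: 123}
value = read_via_dram(0x40, cache0, cache1, status_bits, dram)
```

After the call, the address is valid in cache0 only and its status bit in the controller model is 1.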
In some embodiments of the invention, the method further comprises: in response to the command hitting in the main cache, determining the type of the command; in response to the command being a read command, returning a data response to the main processor; and in response to the command being a write command, updating the main cache data.
In this embodiment, the processor 0 sends a read command to its own cache0, the controller of the cache0 queries the local cache, and if the local cache is hit, returns a data response directly, and ends the processing; the processor 0 sends a write command to its own cache0, and the controller of the cache0 queries the local cache, and if the cache hits, directly updates the data, and ends the processing.
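The hit case needs no traffic outside the local cache. A minimal Python sketch, assuming a dictionary-based cache model and a hypothetical function name `on_hit`:

```python
def on_hit(cache, command, addr, value=None):
    """Handle a command that hits in the local cache."""
    if addr not in cache:
        raise KeyError("this sketch only covers the hit case")
    if command == "read":
        return cache[addr]      # return the data response directly
    if command == "write":
        cache[addr] = value     # update the data in place
        return None
    raise ValueError(f"unknown command: {command}")
```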
In some embodiments of the invention, the method further comprises: in response to the command being a write command, sending the write command to the off-chip storage controller; receiving the data response returned by the off-chip storage controller to the main cache and replacing the main cache data; and writing the pre-replacement data to the off-chip storage controller, which updates it in the off-chip memory.
In this embodiment, processor 0 sends a write command to its own cache0, and the cache0 controller queries the local cache. On a miss, it sends a request command to the off-chip storage DRAM controller, which returns the corresponding data to the cache0 controller. The cache0 controller updates the data and writes the other data that was replaced back to the off-chip storage DRAM controller, which updates that data in the off-chip DRAM, and the processing ends.
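The write-miss flow can be sketched as fetch, replace, write back. The Python model below is illustrative only: the line size `LINE_WORDS`, the `capacity` parameter, the oldest-first choice of the replaced line, and the name `write_miss` are all assumptions that the text does not specify.

```python
from collections import OrderedDict

LINE_WORDS = 4  # assumed line size in words


def write_miss(cache, capacity, dram, addr, value):
    """Handle a write that misses in the local cache (illustrative model)."""
    line_addr, offset = divmod(addr, LINE_WORDS)
    line = list(dram[line_addr])        # data response from the DRAM controller
    if len(cache) >= capacity:
        # Assumption: the oldest line is the one replaced.
        victim_addr, victim_line = cache.popitem(last=False)
        dram[victim_addr] = victim_line  # replaced data is written back off-chip
    line[offset] = value                 # apply the write to the fetched line
    cache[line_addr] = line


# Usage: a one-line cache holding line 0 takes a write to word 5 (line 1).
cache = OrderedDict({0: [1, 2, 3, 4]})
dram = {0: [0, 0, 0, 0], 1: [9, 9, 9, 9]}
write_miss(cache, 1, dram, 5, 7)
```

In this usage the cache ends up holding line 1 as `[9, 7, 9, 9]`, and the replaced line 0 is written back so that `dram[0]` becomes `[1, 2, 3, 4]`.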
In some embodiments of the present invention, sending the read command to the same-cluster processor and determining whether the same-cluster processor returns a data response comprises: in response to the same-cluster processor returning a data response, updating the main cache data and returning the data response to the main processor.
In this embodiment, processor 0 sends a read command to its own cache0, and the cache0 controller queries the local cache. On a miss, it sends a request command to processor 1 and determines whether a data response is returned. If a data response is returned, cache0 stores the data in the cache, returns the data response to processor 0, and updates the corresponding status bit. The cache1 controller sets its corresponding status bit to invalid, and the processing ends.
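The effect of this exchange is that the valid copy moves from cache1 to cache0. A minimal Python sketch, assuming dictionary-based caches in which an absent key means invalid and a hypothetical function name `fetch_from_peer`:

```python
def fetch_from_peer(addr, cache0, cache1):
    """Move the valid copy of addr from cache1 to cache0, if cache1 holds it."""
    if addr not in cache1:
        return None                 # no data response from processor 1
    data = cache1.pop(addr)         # cache1 invalidates its copy
    cache0[addr] = data             # cache0 becomes the only valid holder
    return data


# Usage: after the transfer exactly one cache holds the address.
cache0, cache1 = {}, {0x20: 42}
fetch_from_peer(0x20, cache0, cache1)
holders = sum(0x20 in c for c in (cache0, cache1))
```

Here `holders` is 1, which matches the stated property that only one processor has valid data for an address at a time.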
In some embodiments of the invention, the method further comprises: in response to the state of the off-chip storage controller being used, writing the data of the off-chip storage controller into the off-chip memory and checking the state of the off-chip storage controller again.
In this embodiment, processor 0 sends a read command to its own cache0, and the cache0 controller queries the local cache. On a miss, it sends a request command to processor 1 and determines whether a data response is returned. If no data response is returned, cache0 sends a read request to the off-chip storage DRAM controller, which checks whether the status bit is 0, that is, whether the state is unused. If the status bit is not 0 but 1, that is, the state is used, the controller sends an invalidate command to cluster 1. The controllers of cache2 and cache3 in cluster 1 each check their own status bits; a cache that holds the address named in the invalidate request writes the data to the off-chip storage DRAM controller and sets the corresponding status bit to invalid. The off-chip storage DRAM controller updates the data in the off-chip DRAM and returns a data response to the cache0 controller; cache0 stores the data in the cache and returns a data response to processor 0, and the processing ends.
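The used-state path adds an invalidate-and-write-back round before the DRAM read. The Python sketch below is a simplified model under stated assumptions: caches and DRAM are dictionaries, the status bit is kept per address, the bit is cleared after the invalidate round even if no cache in the other cluster held the address, and the name `read_when_used` is hypothetical.

```python
def read_when_used(addr, cache0, other_cluster_caches, status_bits, dram):
    """Serve a read from processor 0 when the controller may report 'used'."""
    if status_bits.get(addr, 0) == 1:
        # Invalidate command to the other cluster (cache2 and cache3 in the text).
        for cache in other_cluster_caches:
            if addr in cache:
                dram[addr] = cache.pop(addr)   # write back, then invalidate
        # Assumption: the write-back round returns the controller to unused.
        status_bits[addr] = 0
    # Second check of the state: the address is now unused, so read DRAM.
    data = dram[addr]
    status_bits[addr] = 1      # the read marks the address as used again
    cache0[addr] = data
    return data


# Usage: cache2 holds a newer value than DRAM for address 0x80.
cache0, cache2, cache3 = {}, {0x80: 55}, {}
status_bits, dram = {0x80: 1}, {0x80: 11}
value = read_when_used(0x80, cache0, [cache2, cache3], status_bits, dram)
```

In this usage `value` is 55: the newer copy from cache2 reaches DRAM first and is then returned to cache0.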
In some embodiments of the invention, sending the read command to the same-cluster processors comprises: sending the read command to the same-cluster processors in sequence. The method further comprises: in response to any one of the same-cluster processors returning a data response, stopping sending the read command to the other same-cluster processors, updating the main cache data, and returning the data response to the main processor. In response to none of the same-cluster processors returning a data response, the method proceeds to the subsequent step of sending the read command to the off-chip storage controller and checking the status bit of the off-chip storage controller.
In some embodiments of the present invention, reading data from the off-chip memory to update the main cache data and returning the data to the main processor further comprises: updating the state of the off-chip storage controller to used.
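Sequential probing with an early stop can be sketched as a simple loop. In this illustrative Python model each same-cluster cache is a dictionary, the function name `probe_in_order` is hypothetical, and the returned probe count is added only to make the early stop visible.

```python
def probe_in_order(peer_caches, addr):
    """Probe same-cluster caches one by one; stop at the first data response.

    Returns (data, probes_sent). data is None when no peer responds, in which
    case the caller would go on to the off-chip storage controller.
    """
    probes_sent = 0
    for cache in peer_caches:
        probes_sent += 1
        if addr in cache:
            return cache[addr], probes_sent   # stop: later peers are not probed
    return None, probes_sent


# Usage: the second of three peers answers, so the third is never probed.
peers = [{}, {0x10: 7}, {0x10: 9}]
result = probe_in_order(peers, 0x10)
```

In this usage `result` is `(7, 2)`: two probes were sent and the third peer was never asked.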
It should be particularly noted that the steps in the above embodiments of the method for multi-core memory consistency may be interleaved, replaced, added, and deleted with respect to one another. Methods for multi-core memory consistency obtained by such reasonable permutations and combinations therefore also fall within the protection scope of the present invention, and the protection scope of the present invention should not be limited to the embodiments.
In view of the above object, a second aspect of the embodiments of the present invention provides a system for multi-core memory consistency, including: a main cache determination module configured to receive a command sent by the main processor and determine whether the command hits in the main cache; a command determination module configured to determine the type of the command in response to the command missing in the main cache; a same-cluster processor determination module configured to, in response to the command being a read command, send the read command to the same-cluster processor and determine whether the same-cluster processor returns a data response; an off-chip storage controller determination module configured to, in response to the same-cluster processor not returning a data response, send the read command to the off-chip storage controller and check the status bit of the off-chip storage controller; and a processing module configured to, in response to the state of the off-chip storage controller being unused, read data from the off-chip memory to update the main cache data and return the data to the main processor.
In view of the above object, a third aspect of the embodiments of the present invention provides a computer device, including: at least one processor; and a memory storing computer instructions executable on the processor, the instructions being executable by the processor to implement the method steps as above.
The invention also provides a computer readable storage medium storing a computer program which, when executed by a processor, performs the method as above.
Finally, it should be noted that, as one of ordinary skill in the art can appreciate, all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing related hardware. The program of the method for multi-core memory consistency can be stored in a computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. The storage medium of the program may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a Random Access Memory (RAM), or the like. The embodiments of the computer program may achieve the same or similar effects as any of the above-described method embodiments.
Furthermore, the methods disclosed according to embodiments of the present invention may also be implemented as a computer program executed by a processor, and the computer program may be stored in a computer-readable storage medium. When executed by a processor, the computer program performs the functions defined in the methods disclosed in the embodiments of the invention.
Further, the above method steps and system elements may also be implemented using a controller and a computer readable storage medium for storing a computer program for causing the controller to implement the functions of the above steps or elements.
Further, it should be appreciated that the computer-readable storage media (e.g., memory) herein can be either volatile memory or nonvolatile memory, or can include both volatile and nonvolatile memory. By way of example, and not limitation, nonvolatile memory can include Read Only Memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM), which can act as external cache memory. By way of example and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The storage devices of the disclosed aspects are intended to comprise, without being limited to, these and other suitable types of memory.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as software or hardware depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed embodiments of the present invention.
The various illustrative logical blocks, modules, and circuits described in connection with the disclosure herein may be implemented or performed with the following components designed to perform the functions herein: a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination of these components. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP, and/or any other such configuration.
The steps of a method or algorithm described in connection with the disclosure herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
In one or more exemplary designs, the functions may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code. Computer-readable media include both computer storage media and communication media, including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a general-purpose or special-purpose computer, or a general-purpose or special-purpose processor. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the present disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the disclosed embodiments of the invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly supports the exception. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items.
The numbers of the embodiments disclosed in the embodiments of the present invention are merely for description, and do not represent the merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is exemplary only and is not intended to imply that the scope of the disclosure of the embodiments of the invention, including the claims, is limited to these examples. Within the idea of the embodiments of the invention, technical features of the above embodiment or of different embodiments may also be combined, and there are many other variations of the different aspects of the embodiments of the invention as described above, which are not provided in detail for the sake of brevity. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of the embodiments of the present invention are intended to be included within the scope of the embodiments of the present invention.

Claims (10)

1. A method for multi-core memory coherency, comprising the steps of:
receiving a command sent by a main processor, and determining whether the command hits in a main cache;
in response to the command missing in the main cache, determining the type of the command;
in response to the command being a read command, sending the read command to a same-cluster processor and determining whether the same-cluster processor returns a data response;
in response to the same-cluster processor not returning a data response, sending the read command to an off-chip memory controller and checking the state bit of the off-chip memory controller; and
in response to the state of the off-chip memory controller being unused, reading data from the off-chip memory to update the main cache data and returning the data to the main processor.
2. The method for multi-core memory coherency of claim 1, further comprising:
in response to the command hitting in the main cache, determining the type of the command;
in response to the command being a read command, returning a data response to the main processor; and
in response to the command being a write command, updating the main cache data.
3. The method for multi-core memory coherency of claim 1, further comprising:
in response to the command being a write command, sending the write command to the off-chip memory controller;
receiving, at the main cache, a data response returned by the off-chip memory controller, and replacing the data of the main cache; and
writing the pre-replacement data to the off-chip memory controller and updating the pre-replacement data to the off-chip memory.
4. The method of claim 1, wherein sending the read command to the same-cluster processor and determining whether the same-cluster processor returns a data response comprises:
in response to the same-cluster processor returning a data response, updating the main cache data and returning the data response to the main processor.
5. The method of multi-core memory coherence of claim 1, further comprising:
in response to the state of the off-chip memory controller being used, writing cache data of a processor cluster using the off-chip memory to the off-chip memory, and updating the state of the off-chip memory controller to be unused.
6. The method of claim 1, wherein sending the read command to the same-cluster processor comprises: sending the read command sequentially to the processors in the same cluster;
the method further comprising: in response to any one of the processors in the same cluster returning a data response, stopping sending the read command to the other processors in the same cluster, updating the main cache data and returning the data response to the main processor.
7. The method for multi-core memory coherency of claim 1, wherein reading data from the off-chip memory to update the main cache data and returning the data to the main processor further comprises: updating the state of the off-chip memory controller to used.
8. A system for multi-core memory coherency, comprising:
a main cache judging module configured to receive a command sent by a main processor and determine whether the command hits in a main cache;
a command judging module configured to determine the type of the command in response to the command missing in the main cache;
a same-cluster processor judging module configured to, in response to the command being a read command, send the read command to a same-cluster processor and determine whether the same-cluster processor returns a data response;
an off-chip memory controller judging module configured to, in response to the same-cluster processor not returning a data response, send the read command to an off-chip memory controller and check the state bit of the off-chip memory controller; and
a processing module configured to, in response to the state of the off-chip memory controller being unused, read data from the off-chip memory to update the main cache data and return the data to the main processor.
9. A computer device, comprising:
at least one processor; and
a memory storing computer instructions executable on the processor, the instructions, when executed by the processor, implementing the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program which, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
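The command flow recited in claims 1, 2 and 4 to 7 can be illustrated with a small software model. The following Python sketch is only an illustration under stated assumptions and is not the applicant's implementation: the names `OffChipController`, `handle_command`, `in_use`, `user_cache` and `flush_user` are hypothetical, caches are modelled as plain dictionaries keyed by address, and the write-miss path of claim 3 is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class OffChipController:
    """Hypothetical stand-in for the off-chip memory controller and its state bit."""
    memory: dict = field(default_factory=dict)      # address -> data held off-chip
    in_use: bool = False                            # state bit: used / unused
    user_cache: dict = field(default_factory=dict)  # cache of the cluster using the memory

    def flush_user(self):
        # Claim 5: write the using cluster's cache data back, then mark unused.
        self.memory.update(self.user_cache)
        self.in_use = False


def handle_command(kind, addr, main_cache, peers, ctrl, data=None):
    """Process one command; return (data, source) naming where the data came from."""
    if addr in main_cache:                # hit in the main cache (claim 2)
        if kind == "write":
            main_cache[addr] = data       # write hit: update the main cache
        return main_cache[addr], "main_cache"
    if kind != "read":
        raise NotImplementedError("write-miss path (claim 3) is not modelled")
    for peer in peers:                    # claim 6: query peers one at a time
        if addr in peer:                  # first data response ends the search
            main_cache[addr] = peer[addr] # claim 4: update the main cache
            return peer[addr], "peer"
    if ctrl.in_use:                       # claim 5: state bit reads "used"
        ctrl.flush_user()
    value = ctrl.memory[addr]             # claim 1: read the off-chip memory
    main_cache[addr] = value
    ctrl.in_use = True                    # claim 7: mark the controller as used
    ctrl.user_cache = main_cache          # assumption: this cache is now the user
    return value, "off_chip"
```

In this model a read that misses both the main cache and every same-cluster cache first forces a write-back when the state bit is "used", so the value fetched from the off-chip memory reflects the other cluster's most recent data.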

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010372938.1A CN111581133A (en) 2020-05-06 2020-05-06 Method, system, equipment and readable medium for multi-core memory consistency

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010372938.1A CN111581133A (en) 2020-05-06 2020-05-06 Method, system, equipment and readable medium for multi-core memory consistency

Publications (1)

Publication Number Publication Date
CN111581133A true CN111581133A (en) 2020-08-25

Family

ID=72122713

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010372938.1A Pending CN111581133A (en) 2020-05-06 2020-05-06 Method, system, equipment and readable medium for multi-core memory consistency

Country Status (1)

Country Link
CN (1) CN111581133A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112463528A (en) * 2020-11-20 2021-03-09 苏州浪潮智能科技有限公司 In-band and out-band data interaction method, device, equipment and readable medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101430664A (en) * 2008-09-12 2009-05-13 中国科学院计算技术研究所 Multiprocessor system and Cache consistency message transmission method
CN102270180A (en) * 2011-08-09 2011-12-07 清华大学 Multicore processor cache and management method thereof
CN103049422A (en) * 2012-12-17 2013-04-17 浪潮电子信息产业股份有限公司 Method for building multi-processor node system with multiple cache consistency domains
US20130254488A1 (en) * 2012-03-20 2013-09-26 Stefanos Kaxiras System and method for simplifying cache coherence using multiple write policies
CN105740164A (en) * 2014-12-10 2016-07-06 阿里巴巴集团控股有限公司 Multi-core processor supporting cache consistency, reading and writing methods and apparatuses as well as device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200825